Functions summary_
The summary_ functions summarize data and return descriptive metrics.
summary_cat
The goal of summary_cat is to summarize categorical variables.
set.seed(123)
g <- c(sample(letters, 100, replace = TRUE), NA)
summary_cat(g)
#> # A tibble: 1 × 6
#>       n    na blank_space n_distinct mode  modality
#>   <int> <int>       <int>      <int> <chr>    <int>
#> 1   101     1           0         25 y            1
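To make the columns above more concrete, here is a minimal base R sketch of how metrics of this kind can be derived. It is not the implementation of summary_cat: the helper name cat_metrics_sketch is hypothetical, and the definitions it uses (blank_space as empty or whitespace-only strings, n_distinct as the number of distinct non-missing values, mode as the first most frequent value) are assumptions. The modality column is left out because the source does not define it.

```r
# Hypothetical helper, not part of the package: a rough base R analogue.
cat_metrics_sketch <- function(v) {
  v <- as.character(v)
  freq <- table(v)  # table() drops NA by default
  data.frame(
    n = length(v),                                   # all elements, NA included
    na = sum(is.na(v)),
    blank_space = sum(!is.na(v) & trimws(v) == ""),  # assumed definition
    n_distinct = length(freq),                       # assumed: NA not counted
    mode = names(freq)[which.max(freq)]              # first most frequent value
  )
}

res <- cat_metrics_sketch(c("a", "b", "b", " ", NA))
res
```

Using table() keeps the sketch dependency-free, at the cost of converting every value to a string first.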
summary_num
The goal of summary_num is to summarize numeric variables.
set.seed(123)
x <- c(rnorm(10), NA, 10)
summary_num(x) %>% glimpse()
#> Rows: 1
#> Columns: 8
#> $ min <dbl> -1.265061
#> $ p25 <dbl> -0.5030688
#> $ p50 <dbl> 0.07050839
#> $ p75 <dbl> 1.009812
#> $ max <dbl> 10
#> $ mode <dbl> -0.2678934
#> $ mean <dbl> 0.9769324
#> $ cv <dbl> 3.2
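Most of these columns correspond to familiar base R calculations. The sketch below is an illustration rather than the package's code. It assumes that missing values are dropped, that the quartiles use the default method of quantile(), and that cv is the standard deviation divided by the mean. The mode column is omitted because the source does not say how a mode is estimated for continuous data.

```r
# Illustrative only: base R analogues of some summary_num columns.
x_demo <- c(2, 4, 4, 6, NA, 9)

# Quartiles with R's default quantile method, ignoring NA
q <- quantile(x_demo, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)

num_sketch <- data.frame(
  min  = min(x_demo, na.rm = TRUE),
  p25  = q[1],
  p50  = q[2],
  p75  = q[3],
  max  = max(x_demo, na.rm = TRUE),
  mean = mean(x_demo, na.rm = TRUE),
  cv   = sd(x_demo, na.rm = TRUE) / mean(x_demo, na.rm = TRUE)  # assumed definition
)
num_sketch
```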
Setting type = TRUE also returns counts by type of value: the total number of values and the number of missing, negative, zero and positive values:
summary_num(x, type = TRUE) %>% glimpse()
#> Rows: 1
#> Columns: 13
#> $ n <int> 12
#> $ na <int> 1
#> $ negative <int> 5
#> $ equal_zero <int> 0
#> $ positive <int> 6
#> $ min <dbl> -1.265061
#> $ p25 <dbl> -0.5030688
#> $ p50 <dbl> 0.07050839
#> $ p75 <dbl> 1.009812
#> $ max <dbl> 10
#> $ mode <dbl> -0.2678934
#> $ mean <dbl> 0.9769324
#> $ cv <dbl> 3.2
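The count columns can be approximated with logical sums in base R. This is a sketch under the assumption that n counts every element, including missing ones, which is consistent with the output above (1 + 5 + 0 + 6 = 12); it is not taken from the package source.

```r
# Illustrative only: counting values by sign with base R.
x_demo <- c(-1.5, 0, 2, NA, 7, -3)

type_sketch <- data.frame(
  n          = length(x_demo),                 # all elements, including NA
  na         = sum(is.na(x_demo)),
  negative   = sum(x_demo < 0, na.rm = TRUE),  # na.rm skips the NA comparison
  equal_zero = sum(x_demo == 0, na.rm = TRUE),
  positive   = sum(x_demo > 0, na.rm = TRUE)
)
type_sketch
```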
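Setting other_means = TRUE adds the geometric and harmonic means. If x contains negative values, the function warns that they will be ignored:
summary_num(x, other_means = TRUE) %>% glimpse()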
#> Warning in warn_any_logic(x = x, operator = `<`, value = 0, warning = "Negative
#> values will be ignored."): Negative values will be ignored.
#> Rows: 1
#> Columns: 10
#> $ min <dbl> -1.265061
#> $ p25 <dbl> -0.5030688
#> $ p50 <dbl> 0.07050839
#> $ p75 <dbl> 1.009812
#> $ max <dbl> 10
#> $ mode <dbl> -0.2678934
#> $ mean <dbl> 0.9769324
#> $ cv <dbl> 3.2
#> $ geometric_mean <dbl> 0.6946152
#> $ harmonic_mean <dbl> 0.7436103
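For reference, the textbook definitions of the two extra means can be written in a few lines of base R. This is a sketch only: dropping missing and non-positive values before computing is an assumption, and the package may handle such values differently, so the results of this code need not match the output above.

```r
# Textbook definitions; dropping NA and non-positive values is an assumption.
x_demo <- c(1, 2, 4, NA, -3)
pos <- x_demo[!is.na(x_demo) & x_demo > 0]

geometric_mean <- exp(mean(log(pos)))  # nth root of the product, via logs
harmonic_mean  <- 1 / mean(1 / pos)    # reciprocal of the mean reciprocal

c(geometric = geometric_mean, harmonic = harmonic_mean)
```

Computing the geometric mean through logarithms avoids overflow when the product of many values would be very large.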
summary_seq
The goal of summary_seq is to count runs of consecutive repeated values: each row of the result gives a value and the number of times it repeats in a row.
y <- c(1, 1, 1, 2, 2, 6, 7, 1, 1)
summary_seq(y)
#> # A tibble: 5 × 2
#>   value num_rep
#>   <dbl>   <int>
#> 1     1       3
#> 2     2       2
#> 3     6       1
#> 4     7       1
#> 5     1       2
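This kind of result is known as run-length encoding, and base R's rle() performs a comparable computation. The sketch below is offered as a point of comparison, not as a description of how summary_seq is implemented; the name seq_sketch is made up for the example.

```r
# Run-length encoding with base R, for comparison.
y_demo <- c(1, 1, 1, 2, 2, 6, 7, 1, 1)
runs <- rle(y_demo)

# rle() returns the run values and their lengths as two parallel vectors
seq_sketch <- data.frame(value = runs$values, num_rep = runs$lengths)
seq_sketch
```

Note that the value 1 appears twice, because its two runs are not adjacent.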
summary_xy
The goal of summary_xy is to summarize the relationship between two numeric variables by computing their covariance and the Pearson, Kendall and Spearman correlation coefficients.
x <- rnorm(100)
y <- rnorm(100)
summary_xy(x,y)
#> # A tibble: 1 × 4
#>   covariance pearson  kendall spearman
#>        <dbl>   <dbl>    <dbl>    <dbl>
#> 1    -0.0654 -0.0724 -0.00970  -0.0144
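The same four statistics are available in base R through cov() and cor(). The sketch below shows one way to assemble them into a single row; it is an illustration, not the package's code, and because it sets its own arbitrary seed its numbers will differ from the output above.

```r
# Illustrative only: the four statistics with base R's cov() and cor().
set.seed(1)  # arbitrary seed, chosen for this example only
x_demo <- rnorm(100)
y_demo <- rnorm(100)

xy_sketch <- data.frame(
  covariance = cov(x_demo, y_demo),
  pearson    = cor(x_demo, y_demo, method = "pearson"),
  kendall    = cor(x_demo, y_demo, method = "kendall"),
  spearman   = cor(x_demo, y_demo, method = "spearman")
)
xy_sketch
```

Pearson measures linear association, while Kendall and Spearman are rank-based, so the three coefficients can differ noticeably when the relationship is monotonic but not linear.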