Run crosstabs in R — cross

Create a frequencies table with multiple distinct grouping variables/banners

Usage

cross_freqs(
  dataset,
  group_vars,
  ...,
  stat = c("percent", "mean", "median", "min", "max", "quantile", "summary"),
  percentile = NULL,
  nas = TRUE,
  wt = NULL,
  prompt = FALSE,
  digits = 2,
  nas_group = TRUE,
  factor_group = FALSE,
  wide = FALSE,
  exclude_groups = FALSE,
  include_overall = FALSE
)

Arguments

dataset: A dataframe.
group_vars: A character vector of variable names by which to subset your frequencies. In a traditional crosstab, these would be the banner variables.
...: The unquoted names of a set of variables in the dataset. If nothing is specified, the function runs a frequency on every column in the given dataset.
stat: Character, stat to run. Currently accepts 'percent', 'mean', 'median', 'min', 'max', 'quantile', and 'summary' (default: 'percent').
percentile: Double, for use when stat = 'quantile'. Input should be a real number x such that 0 <= x <= 100. Stands for percentile rank, which is a quantile expressed on a 100-point scale (default: NULL).
nas: Boolean, whether or not to include NAs in the tabulation (default: TRUE).
wt: The unquoted name of a weighting variable in the dataset (default: NULL).
prompt: Boolean, whether or not to include the prompt in the dataset (default: FALSE).
digits: Integer, number of significant digits for rounding (default: 2).
nas_group: Boolean, whether or not to include NA values for the grouping variable in the tabulation (default: TRUE).
factor_group: Boolean, whether or not to convert the grouping variable to a factor and use its labels instead of its underlying numeric values (default: FALSE).
wide: Boolean, whether the result should be one long dataframe (FALSE) or a wide dataframe nested on the group_vars (TRUE) (default: FALSE).
exclude_groups: Boolean, applies only when the group_vars are also among the frequency variables, as happens when select() is used to run cross_freqs on all variables in the dataset. FALSE keeps the group_vars as frequency variables; TRUE excludes them (default: FALSE).
include_overall: Boolean, whether to include the overall frequency levels for variables (default: FALSE).
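
The 0 to 100 scale used by percentile can be related to the 0 to 1 probabilities that base R's quantile() expects. The following is a minimal sketch of that conversion, not the function's internals: percentile_to_prob is a hypothetical helper introduced here for illustration, and grouping mpg by am is an arbitrary choice.

```r
# Hypothetical helper (not part of the package): convert a percentile rank
# on a 0-100 scale to the 0-1 probability that stats::quantile() expects.
percentile_to_prob <- function(percentile) {
  stopifnot(is.numeric(percentile), percentile >= 0, percentile <= 100)
  percentile / 100
}

# 25th percentile of mpg within each level of am, using base R only.
q25_by_am <- tapply(
  mtcars$mpg,
  mtcars$am,
  quantile,
  probs = percentile_to_prob(25)
)
q25_by_am
```

Under this reading, percentile = 50 corresponds to the median and percentile = 100 to the maximum.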

Value

A dataframe with the variable names, prompts, values, labels, counts, stats, and resulting calculations, split out by subgroups (group_vars).

Examples

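# Option 1: derive the grouping variable names by selecting columns
GROUP_VARS <-
  mtcars |>
  dplyr::select(
    am,
    vs
  ) |>
  names()

# Option 2: write the same character vector directly
GROUP_VARS <- c("am", "vs")

# Frequencies of gear and carb, split by each grouping variable in turn
mtcars |> cross_freqs(
  group_vars = GROUP_VARS,
  gear,
  carb
)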
#> Adding missing grouping variables: `am`
#> Adding missing grouping variables: `am`
#> Adding missing grouping variables: `vs`
#> Adding missing grouping variables: `vs`
#> # A tibble: 27 × 8
#>    group_var_name group_var variable value label     n stat    result
#>    <chr>          <fct>     <chr>    <chr> <chr> <int> <chr>    <dbl>
#>  1 am             0         gear     3     3        15 percent   0.79
#>  2 am             0         gear     4     4         4 percent   0.21
#>  3 am             1         gear     4     4         8 percent   0.62
#>  4 am             1         gear     5     5         5 percent   0.38
#>  5 am             0         carb     1     1         3 percent   0.16
#>  6 am             0         carb     2     2         6 percent   0.32
#>  7 am             0         carb     3     3         3 percent   0.16
#>  8 am             0         carb     4     4         7 percent   0.37
#>  9 am             1         carb     1     1         4 percent   0.31
#> 10 am             1         carb     2     2         4 percent   0.31
#> # ℹ 17 more rows
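
As a sanity check on the result column, the gear-by-am percentages in the first four rows of the output above can be reproduced with base R alone. This is a sketch for cross-checking only and makes no claim about how cross_freqs computes its results; the object names counts and props are illustrative.

```r
# Counts of gear within each level of am (rows = am, columns = gear).
counts <- table(am = mtcars$am, gear = mtcars$gear)

# Row-wise proportions, rounded to two decimals like the result column above.
props <- round(prop.table(counts, margin = 1), 2)
props
```

Unlike the tibble above, the base R table also shows combinations with zero observations (for example, am = 0 with gear = 5), which appear as 0.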