Arguments
- data
A validated data.frame that is the output of st_validate().
- subgroup
A character vector of subgrouping variables. By default, aggregate estimates are generated for the overall data, as well as age group, sex, and age group + sex subgroups.
- borderline
How borderline results should be treated. By default, they are treated as negative.
- add_ci
Boolean. Whether to add a binomial proportion confidence interval. It is calculated using the Wilson score interval method through the binom::binom.confint() function.
- round_digits
Integer indicating the number of decimal places to which the estimate is rounded. It is passed to the digits argument of base::round().
- test_combination
Not functional yet. Intended to specify the relationship between assays when the data is based on more than one assay.
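For reference, the Wilson score interval named under add_ci has the standard form below. The notation is introduced here for illustration and is not taken from the package: x is the number of positive results, n the number of samples, and z the standard normal quantile for the chosen confidence level (about 1.96 for a 95% interval). It is assumed, not confirmed by this page, that x and n correspond to the numerator and denominator columns of the output.

```latex
\hat{p} = \frac{x}{n}, \qquad
\text{CI}_{\text{Wilson}} =
\frac{\hat{p} + \dfrac{z^{2}}{2n}}{1 + \dfrac{z^{2}}{n}}
\;\pm\;
\frac{z}{1 + \dfrac{z^{2}}{n}}
\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n} + \frac{z^{2}}{4n^{2}}}
```

Unlike the simple normal-approximation interval, these bounds always stay within [0, 1], which matters for subgroups with few samples or with seroprevalence close to 0 or 1.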
Examples
mydata <- dplyr::mutate(
  sample_raw_data,
  age = ifelse(age %in% c(-999, 999), NA, age)
)
validated_df <- st_validate(
  mydata,
  dataset_id = dataset_id,
  id = id,
  age_group = age_group,
  age = age,
  sex = sex,
  adm0 = regions$adm0$Canada,
  adm1 = regions$adm1$Canada$Alberta,
  adm2 = regions$adm2$Canada$Alberta$Calgary,
  collection_start_date = "2020-Mar-01",
  collection_end_date = "15/8/2023",
  test_id = assays$`SARS-CoV-2`$`ID.Vet - IgG - ID Screen`,
  result = result,
  result_cat = result_cat,
  include_others = TRUE,
  rmd_safe = TRUE
)
#> ── Mapping columns and validating data ─────────────────────────────────────────
#> ✔ age_group is a valid column. [521ms]
#> ✔ age is a valid column. [23ms]
#> ✔ sex is a valid column. [13ms]
#> ✔ adm0 is a valid string. [105ms]
#> ✔ adm1 is a valid string. [10ms]
#> ✔ adm2 is a valid string. [15ms]
#> ✔ collection_start_date is a valid scalar. [187ms]
#> ✔ collection_end_date is a valid scalar. [20ms]
#> ✔ test_id is a valid string. [8ms]
#> ✔ result is a valid column. [21ms]
#> ✔ result_cat is a valid column. [9ms]
#> ✔ dataset_id is a valid column. [3ms]
#> ✔ id is a valid column. [11ms]
#> ── Validation finished ─────────────────────────────────────────────────────────
#> Success! Validated data created.
st_aggregate(validated_df)
#> # A tibble: 26 × 27
#>    dataset_id subgroup  strata age_group age_min age_max sex    pop_adj test_adj
#>         <int> <chr>     <chr>  <chr>       <dbl>   <dbl> <chr>  <lgl>   <lgl>
#>  1          1 overall   NA     All            NA      NA All    FALSE   FALSE
#>  2          2 overall   NA     All            NA      NA All    FALSE   FALSE
#>  3          1 age_group 0-17   0-17            0      17 All    FALSE   FALSE
#>  4          1 age_group 18-64  18-64         18      64 All    FALSE   FALSE
#>  5          1 age_group 65+    65+            NA      NA All    FALSE   FALSE
#>  6          1 age_group NA     NA             NA      NA All    FALSE   FALSE
#>  7          2 age_group 0-17   0-17            1      17 All    FALSE   FALSE
#>  8          2 age_group 18-64  18-64         21      57 All    FALSE   FALSE
#>  9          2 age_group 65+    65+            NA      NA All    FALSE   FALSE
#> 10          1 sex       Female All            NA      NA Female FALSE   FALSE
#> # ℹ 16 more rows
#> # ℹ 18 more variables: adm1 <chr>, adm2 <chr>, start_date <date>,
#> #   end_date <date>, test_id_1 <chr>, test_id_2 <chr>, test_id_3 <chr>,
#> #   test_combination <lgl>, numerator <dbl>, denominator <int>, seroprev <dbl>,
#> #   seroprev_95_ci_lower <dbl>, seroprev_95_ci_upper <dbl>,
#> #   ab_denominator <int>, ab_titer_min <dbl>, ab_titer_max <dbl>,
#> #   ab_titer_mean <dbl>, ab_titer_sd <dbl>
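
To sanity-check a single row of the output by hand, the following is a minimal base R sketch of a Wilson score interval followed by rounding with round(). The helper wilson_ci() and its arguments are hypothetical names used only for illustration. The function documented here is stated to rely on binom::binom.confint() instead, so this is an independent approximation of that calculation, not the package's own code.

```r
# Hypothetical helper (not part of the package): Wilson score interval for
# x positive results out of n samples, using base R only.
wilson_ci <- function(x, n, conf_level = 0.95, round_digits = 4) {
  stopifnot(n > 0, x >= 0, x <= n)

  # Two-sided standard normal quantile, e.g. about 1.96 for 95%
  z <- stats::qnorm(1 - (1 - conf_level) / 2)

  p_hat <- x / n
  denom <- 1 + z^2 / n
  centre <- (p_hat + z^2 / (2 * n)) / denom
  half_width <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / denom

  # Rounded the same way the round_digits argument is described: via round()
  round(
    c(estimate = p_hat, lower = centre - half_width, upper = centre + half_width),
    digits = round_digits
  )
}

# Example: 12 positive results out of 100 samples
wilson_ci(x = 12, n = 100)
```

If the binom package is installed, binom::binom.confint(12, 100, methods = "wilson") should give matching lower and upper bounds, which is a convenient way to confirm that the sketch agrees with the documented method.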