Pre-process wastewater input data, adding needed indices and flagging potential outliers
Source: R/preprocessing.R
Usage
preprocess_ww_data(
  ww_data,
  conc_col_name = "log_genome_copies_per_ml",
  lod_col_name = "log_lod"
)
Arguments
- ww_data
dataframe containing the following columns: site, lab, date, site_pop, a column for concentration, and a column for the limit of detection
- conc_col_name
string indicating the name of the column containing virus genome concentration measurements in log genome copies per mL; default is "log_genome_copies_per_ml"
- lod_col_name
string indicating the name of the column containing the limit of detection for each wastewater measurement; default is "log_lod", as shown under Usage. Note that any value in the conc_col_name column that is equal to the limit of detection will be treated as below the limit of detection.
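The note on values equal to the limit of detection can be illustrated with a small base R sketch. This is not the package's implementation: the toy data frame and the integer 0/1 encoding of below_lod are assumptions made for illustration only.

```r
# Hypothetical toy data; column names mirror the defaults shown under Usage.
toy <- data.frame(
  log_genome_copies_per_ml = log(c(345.2, 20, 12.5)),
  log_lod = log(c(20, 20, 20))
)

# A measurement equal to the limit of detection counts as below it,
# so the comparison is `<=` rather than `<`.
# Assumption: the flag is stored as 0/1; the package may encode it differently.
toy$below_lod <- as.integer(toy$log_genome_copies_per_ml <= toy$log_lod)

print(toy)
```

In this sketch the second row, whose concentration equals the limit of detection, is flagged together with the third row, which lies strictly below it.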
Value
a dataframe containing the same columns as ww_data, except that the conc_col_name column will be replaced with log_genome_copies_per_ml and the lod_col_name column will be replaced with log_lod_sewage, plus the following additional columns needed for the Stan model: lab_site_index, site_index, flag_as_ww_outlier, below_lod, lab_site_name, and exclude
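To make the index columns more concrete, here is a minimal base R sketch of how integer indices of this kind could be derived from site and lab identifiers. It is an assumption-laden illustration, not the package's code: the toy data, the label format used for lab_site_name, and the use of factor() to number the groups are all hypothetical.

```r
# Hypothetical toy data: two sites, with site "A" measured by two labs.
toy <- data.frame(
  site = c("A", "A", "B", "B"),
  lab  = c("x", "y", "x", "x")
)

# Hypothetical label format for each unique lab-site combination.
toy$lab_site_name <- paste0("Site: ", toy$site, ", Lab: ", toy$lab)

# Integer indices, one per unique site and one per unique lab-site pair.
# Assumption: numbering follows the sorted order of the unique labels.
toy$site_index <- as.integer(factor(toy$site))
toy$lab_site_index <- as.integer(factor(toy$lab_site_name))

print(toy)
```

Consecutive integer indices like these are a common way to let a Stan model address site-level and lab-site-level parameters by position.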
Examples
ww_data <- tibble::tibble(
  date = lubridate::ymd(rep(c("2023-11-01", "2023-11-02"), 2)),
  site = c(rep(1, 2), rep(2, 2)),
  lab = c(1, 1, 1, 1),
  log_conc = log(c(345.2, 784.1, 401.5, 681.8)),
  log_lod = log(c(20, 20, 15, 15)),
  site_pop = c(rep(2e5, 2), rep(4e5, 2))
)
ww_data_preprocessed <- preprocess_ww_data(ww_data,
  conc_col_name = "log_conc",
  lod_col_name = "log_lod"
)