Flag WW outliers
Usage
flag_ww_outliers(
ww_data,
conc_col_name = "log_genome_copies_per_ml",
rho_threshold = 2,
log_conc_threshold = 3,
threshold_n_dps = 1
)
Arguments
- ww_data
dataframe containing the following columns: site, lab, lab_site_index, date, a column for concentration, and below_lod
- conc_col_name
string, name of the column containing the concentration measurements in the wastewater data, default is
log_genome_copies_per_ml
- rho_threshold
float indicating the z-score threshold for a "jump", i.e. the change in log concentration between consecutive measurements
- log_conc_threshold
float indicating the z-score threshold for log concentration
- threshold_n_dps
minimum number of data points above the limit of detection (LOD) per lab-site
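As a rough illustration of what a z-score threshold on log concentration means, the sketch below standardizes a toy series and flags values whose absolute z-score reaches the threshold. It is not the package's implementation: the data are made up, and the use of abs() and >= is an assumption made for illustration.

```r
# Toy log-concentration series for one hypothetical lab-site (made-up values)
log_conc <- c(rep(2.0, 6), rep(2.2, 6), 6.0)

# Standardize: subtract the mean and divide by the sample standard deviation
z <- (log_conc - mean(log_conc)) / sd(log_conc)

# Same value as the default log_conc_threshold in the Usage section
log_conc_threshold <- 3

# Assumed rule for this sketch: flag when |z| is at or above the threshold
flag <- as.integer(abs(z) >= log_conc_threshold)
```

With only a handful of observations, no single point can reach a z-score of 3, which is one reason a minimum number of data points per lab-site is useful.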
Value
ww_w_outliers_flagged, a dataframe containing all of the columns in the
ww_data input dataframe plus two additional columns: flag_as_ww_outlier
and exclude. flag_as_ww_outlier contains a 0 if the data point is not an
outlier and a 1 if it is an outlier. exclude tells the model whether or not
to exclude that data point; by default it is set to 0 for all data points
(even those flagged as outliers). Excluding the outliers is a second,
optional step.
Examples
ww_data <- wwinference::ww_data
ww_data_preprocessed <- wwinference::preprocess_ww_data(ww_data)
ww_data_outliers_flagged <- flag_ww_outliers(ww_data_preprocessed)
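Because exclude is left at 0 for every row, flagged outliers are only dropped if they are explicitly marked for exclusion afterwards. The following is a minimal base R sketch of that optional second step. The data frame ww_flagged is a hypothetical stand-in for the function's output, with made-up values and only a subset of the columns.

```r
# Hypothetical stand-in for the output of flag_ww_outliers() (made-up values)
ww_flagged <- data.frame(
  date = as.Date("2024-01-01") + 0:3,
  log_genome_copies_per_ml = c(2.1, 2.3, 6.0, 2.2),
  flag_as_ww_outlier = c(0, 0, 1, 0),
  exclude = c(0, 0, 0, 0)
)

# Inspect the rows that were flagged as outliers
flagged_rows <- ww_flagged[ww_flagged$flag_as_ww_outlier == 1, ]

# Optional second step: mark flagged rows for exclusion,
# leaving any existing exclude values untouched
ww_flagged$exclude <- ifelse(
  ww_flagged$flag_as_ww_outlier == 1,
  1,
  ww_flagged$exclude
)
```

Keeping flagging and exclusion separate allows the flagged rows to be reviewed before any of them are removed from the model fit.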