Flag WW outliers

Usage

flag_ww_outliers(
  ww_data,
  conc_col_name = "log_genome_copies_per_ml",
  rho_threshold = 2,
  log_conc_threshold = 3,
  threshold_n_dps = 1
)

Arguments

ww_data

dataframe containing the following columns: site, lab, lab_site_index, date, a column for concentration, and below_lod

conc_col_name

string, name of the column containing the concentration measurements in the wastewater data, default is "log_genome_copies_per_ml"

rho_threshold

float indicating the z-score threshold for "jump"

log_conc_threshold

float indicating the z-score threshold for log concentration

threshold_n_dps

minimum number of data points above the limit of detection (LOD) per lab-site
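To illustrate what a z-score threshold on log concentration means, here is a minimal base R sketch on made-up values. It is not the package's implementation: the vector `log_conc`, the variable names, and the use of a single series mean and standard deviation are assumptions for illustration only. The function itself may group by lab-site and handle values below the LOD differently.

```r
# Toy log concentrations for one hypothetical lab-site (made-up values)
log_conc <- c(2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 1.8, 2.0, 2.2, 1.9, 2.1, 9.0)

# z-score of each observation relative to the series mean and standard deviation
z_score_conc <- (log_conc - mean(log_conc)) / sd(log_conc)

# Flag observations whose absolute z-score meets or exceeds the threshold
# (3 mirrors the default of log_conc_threshold)
log_conc_threshold <- 3
flagged <- as.integer(abs(z_score_conc) >= log_conc_threshold)

# Positions of the flagged observations
which(flagged == 1)
```

In a short series a single extreme value inflates the standard deviation, so only very large deviations reach a threshold of 3.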

Value

ww_w_outliers_flaged, a dataframe containing all of the columns in the ww_data input dataframe plus two additional columns: flag_as_ww_outlier and exclude. flag_as_ww_outlier contains a 0 if the data point is not an outlier and a 1 if it is an outlier. exclude tells the model whether or not to exclude that data point; by default it is set to 0 for all data points (even those flagged as outliers). Excluding the outliers is a second, optional step.
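The optional second step described above can be sketched in base R as follows. The data frame `ww_flagged` is a hypothetical stand-in for the function's output, with made-up values, and setting exclude directly is only one possible way to do it, not necessarily how the package intends it.

```r
# Toy stand-in for the output of flag_ww_outliers() (made-up values)
ww_flagged <- data.frame(
  date = as.Date("2024-01-01") + 0:3,
  log_genome_copies_per_ml = c(2.0, 2.1, 9.0, 1.9),
  flag_as_ww_outlier = c(0, 0, 1, 0),
  exclude = c(0, 0, 0, 0)
)

# Optional second step: mark flagged outliers for exclusion,
# leaving the exclude value of all other rows unchanged
ww_flagged$exclude <- ifelse(
  ww_flagged$flag_as_ww_outlier == 1,
  1,
  ww_flagged$exclude
)
```

Keeping flagging and exclusion as separate columns lets the flags be reviewed before any data point is dropped from model fitting.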

Examples

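# Load the example wastewater data included with the package
ww_data <- wwinference::ww_data
# Preprocess the data before flagging
ww_data_preprocessed <- wwinference::preprocess_ww_data(ww_data)
# Flag outliers using the default thresholds
ww_data_outliers_flagged <- flag_ww_outliers(ww_data_preprocessed)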