Joint inference of count data (e.g. cases/admissions) and wastewater data

Provides a user friendly interface around package functionality to produce estimates, nowcasts, and forecasts pertaining to user-specified delay distributions, set parameters, and priors that can be modified to handle different types of "global" count data and "local" wastewater concentration data using a Bayesian hierarchical framework applied to the two distinct data sources. By default the model assumes a fixed generation interval and delay from infection to the event that is counted. See the getting started vignette for an example model specifications fitting COVID-19 hospital admissions from a hypothetical state and wasteawter concentration data from multiple sites within that state.

Usage

wwinference(
  ww_data,
  count_data,
  forecast_date = NULL,
  calibration_time = 90,
  forecast_horizon = 28,
  model_spec = get_model_spec(),
  fit_opts = list(),
  generate_initial_values = TRUE,
  initial_values_seed = NULL,
  compiled_model = compile_model()
)

# S3 method for class 'wwinference_fit'
print(x, ...)

# S3 method for class 'wwinference_fit'
summary(object, ...)

Arguments

ww_data: A dataframe containing the pre-processed, site-level wastewater concentration data for a model run. The dataframe must contain the following columns: date, site, lab, log_genome_copies_per_ml, lab_site_index, log_lod, below_lod, site_pop exclude.
count_data: A dataframe containing the pre-procssed, "global" (e.g. state) daily count data, pertaining to the number of events that are being counted on that day, e.g. number of daily cases or daily hospital admissions. Must contain the following columns: date, count , total_pop
forecast_date: a character string in ISO8601 format (YYYY-MM-DD) indicating the date that the forecast is to be made. Default is NULL
calibration_time: integer indicating the number of days to calibrate the model for, default is 90
forecast_horizon: integer indicating the number of days, including the forecast date, to produce forecasts for, default is 28
model_spec: The model specification parameters as defined using get_model_spec(). The default here pertains to the forecast_date in the example data provided by the package, but this should be specified by the user based on the date they are producing a forecast
fit_opts: MCMC fitting options, as a list of keys and values. These are passed as keyword arguments to compiled_model$sample(). Where no option is specified, wwinference() will fall back first on a package-specific default value given by get_mcmc_options(), if one exists. If no package-specific default exists, wwinference() will fall back on the default value defined in $sample(). See the documentation for $sample() for details on available options.
generate_initial_values: Boolean indicating whether or not to specify the initialization of the sampler, default is TRUE, meaning that initialization lists will be generated and passed as the init argument to the model object $sample() call. function
initial_values_seed: set of integers indicating the random seed of the R sampler of the initial values, default is NULL
compiled_model: The pre-compiled model as defined using compile_model()
x, object: Object of class wwinference_fit
...: Additional parameters passed to the corresponding method

Value

An object of the ww_inference_fit class containing the following items that are intended to be passed to downstream functions to do things like extract posterior draws, get diangostic behavior, and plot results (for example). If the model runs, this function will return: fit: The CmdStan object that is returned from the call to cmdstanr::sample(). Can be used to access draws, summary, diagnostics, etc. raw_input_data: a list containing the input ww_data and the input count_data used in the model. stan_data_list: a list containing the inputs passed directly to the stan model fit_opts: a list of the MCMC specifications passed to stan

If the model fails to run, a list containing the follow will be returned: error: the error message provided from stan, indicating why the model failed to run. Note, the model might still run and produce draws even if it has major model issues. We recommend the user always run the check_diagnostics() function on the parameter_diagnostics as part of any pipeline to ensure model convergence.

The print method prints out information about the model and returns the object invisibly.

The summary method returns the outcome from the $summary (cmdstanr::summary()) function.

Examples

if (FALSE) { # \dontrun{
ww_data <- tibble::tibble(
  date = rep(seq(
    from = lubridate::ymd("2023-08-01"),
    to = lubridate::ymd("2023-11-01"),
    by = "weeks"
  ), 2),
  site = c(rep(1, 14), rep(2, 14)),
  lab = c(rep(1, 28)),
  conc = log(abs(rnorm(28, mean = 500, sd = 50))),
  lod = log(c(rep(20, 14), rep(15, 14))),
  site_pop = c(rep(2e5, 14), rep(4e5, 14))
)

ww_data_preprocessed <- preprocess_ww_data(ww_data,
  conc_col_name = "conc",
  lod_col_name = "lod"
)
input_ww_data <- indicate_ww_exclusions(ww_data_preprocessed)

hosp_data <- tibble::tibble(
  date = seq(
    from = lubridate::ymd("2023-07-01"),
    to = lubridate::ymd("2023-10-30"),
    by = "days"
  ),
  daily_admits = sample(5:70, 122, replace = TRUE),
  state_pop = rep(1e6, 122)
)

input_count_data <- preprocess_count_data(
  hosp_data,
  "daily_admits",
  "state_pop"
)

generation_interval <- to_simplex(c(0.01, 0.2, 0.3, 0.2, 0.1, 0.1, 0.01))
inf_to_count_delay <- to_simplex(c(
  rep(0.01, 12), rep(0.2, 4),
  rep(0.01, 10)
))
infection_feedback_pmf <- generation_interval

params <- get_params(
  system.file("extdata", "example_params.toml",
    package = "wwinference"
  )
)
forecast_date <- "2023-11-06"
calibration_time <- 90
forecast_horizon <- 28
include_ww <- 1

ww_fit <- wwinference(
  ww_data = input_ww_data,
  count_data = input_count_data,
  forecast_date = forecast_date,
  calibration_time = calibration_time,
  forecast_horizon = forecast_horizon,
  model_spec = get_model_spec(
    generation_interval = generation_interval,
    inf_to_count_delay = inf_to_count_delay,
    infection_feedback_pmf = infection_feedback_pmf,
    params = params
  ),
  fit_opts = list(
    iter_warmup = 250,
    iter_sampling = 250,
    chains = 2
  )
)
} # }

Joint inference of count data (e.g. cases/admissions) and wastewater data

Usage

Arguments

Value

See also

Examples