Generates simulated hospital admissions and site-level wastewater data directly from the generative model, under conditions and parameter values specified by the user.

Usage

generate_simulated_data(
  site_level_inf_dynamics = TRUE,
  site_level_conc_dynamics = FALSE,
  r_in_weeks = c(rep(1.1, 5), rep(0.9, 5), 1 + 0.007 * 1:16),
  n_sites = 4,
  ww_pop_sites = c(4e+05, 2e+05, 1e+05, 50000),
  pop_size = 1e+06,
  n_lab_sites = 5,
  map_site_to_lab = c(1, 1, 2, 3, 4),
  ot = 90,
  nt = 9,
  forecast_time = 28,
  sim_start_date = ymd("2023-10-30"),
  hosp_wday_effect = c(0.95, 1.01, 1.02, 1.02, 1.01, 1, 0.99)/7,
  i0_over_n = 5e-04,
  initial_growth = 1e-04,
  sd_in_lab_level_multiplier = 0.25,
  mean_obs_error_in_ww_lab_site = 0.3,
  mean_reporting_freq = 1/7,
  sd_reporting_freq = 1/14,
  mean_reporting_latency = 7,
  sd_reporting_latency = 5,
  mean_log_lod = 3.8,
  sd_log_lod = 0.2,
  example_params_path = fs::path_package("extdata", "example_params.toml", package =
    "cfaforecastrenewalww")
)

Arguments

site_level_inf_dynamics

if TRUE, the toy data has variation in the site-level R(t); if FALSE, each site is assumed to have the same underlying R(t) as the state

site_level_conc_dynamics

if TRUE, the toy data has day-to-day variation in the site-level concentration; if FALSE, the relationship from infections to concentration is the same across sites

r_in_weeks

The mean weekly R(t) that drives infection dynamics at the state level. This gets jittered with random noise to add week-to-week variation.
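The default value can be inspected in plain R to see the trajectory it encodes: five weeks at 1.1, five weeks at 0.9, then a 16-week ramp. The snippet below is a minimal sketch. The expansion to daily values with `rep(each = 7)` is an illustrative assumption used only to compare lengths, not a description of how the function interpolates internally.

```r
# Default weekly R(t) trajectory from the Usage section
r_in_weeks <- c(rep(1.1, 5), rep(0.9, 5), 1 + 0.007 * 1:16)

n_weeks <- length(r_in_weeks)   # number of weekly values supplied
r_range <- range(r_in_weeks)    # lowest and highest weekly R(t)

# Assumption for illustration only: hold each weekly value for 7 days
r_daily <- rep(r_in_weeks, each = 7)

# Days spanned by calibration, nowcast, and forecast with the defaults
n_days_needed <- 90 + 9 + 28
covers_horizon <- length(r_daily) >= n_days_needed
```

With the defaults, the weekly vector spans more days than `ot + nt + forecast_time`, so a shorter custom vector may need to be checked against the time horizon you request.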

n_sites

Number of sites

ww_pop_sites

Catchment area population of each of those sites (order must match the sites)

pop_size

Population size in the state

n_lab_sites

Number of unique combinations of labs and sites. Must be greater than or equal to n_sites

map_site_to_lab

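Vector mapping the sites to the lab-sites, in order of the sites. In the default, it has one entry per lab-site, each giving the index of the site that lab-site belongs to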
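The site and lab-site arguments have to agree with one another, so it can help to check them before calling the function. The sketch below uses the default values. The requirement that `n_lab_sites >= n_sites` comes from the argument description above; the length and range checks on `map_site_to_lab` are assumptions inferred from the shape of the default vector.

```r
# Default site and lab-site configuration from the Usage section
n_sites <- 4
n_lab_sites <- 5
ww_pop_sites <- c(4e5, 2e5, 1e5, 5e4)
pop_size <- 1e6
map_site_to_lab <- c(1, 1, 2, 3, 4)

stopifnot(
  length(ww_pop_sites) == n_sites,            # one population per site
  n_lab_sites >= n_sites,                     # stated in the docs
  length(map_site_to_lab) == n_lab_sites,     # assumption: one entry per lab-site
  all(map_site_to_lab %in% seq_len(n_sites))  # assumption: entries are site indices
)

# Number of lab-sites attached to each site
labs_per_site <- tabulate(map_site_to_lab, nbins = n_sites)

# Share of the state population covered by the wastewater catchments
coverage <- sum(ww_pop_sites) / pop_size
```

In the default configuration, site 1 is measured by two labs and the remaining sites by one lab each.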

ot

observed time: length of hospital admissions calibration time in days

nt

nowcast time: length of time between last hospital admissions date and forecast date in days

forecast_time

duration of the forecast in days, e.g. 28 days

sim_start_date

the start date of the simulation, used to get a weekday vector

hosp_wday_effect

a simplex of length 7 describing how the hospital admissions are spread out over a week, starting at Monday = 1
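The timing arguments and the weekday effect can be lined up in base R as follows. This is a sketch under stated assumptions: the date axis is assumed to start at `sim_start_date` and to run for `ot + nt + forecast_time` days, ignoring any unobserved initialization period, and base `as.Date()` is used in place of `ymd()`.

```r
# Default timing arguments from the Usage section
ot <- 90
nt <- 9
forecast_time <- 28
sim_start_date <- as.Date("2023-10-30")

# Assumption: one date per day of calibration, nowcast, and forecast
n_days <- ot + nt + forecast_time
dates <- sim_start_date + 0:(n_days - 1)

# ISO weekday number, where Monday = 1 and Sunday = 7
wday_index <- as.integer(format(dates, "%u"))

# Default weekday effect; a simplex must be non-negative and sum to 1
hosp_wday_effect <- c(0.95, 1.01, 1.02, 1.02, 1.01, 1, 0.99) / 7
is_simplex <- all(hosp_wday_effect >= 0) &&
  isTRUE(all.equal(sum(hosp_wday_effect), 1))

# Weekday effect looked up for every simulated day
daily_wday_effect <- hosp_wday_effect[wday_index]
```

The default start date falls on a Monday, so the first element of `hosp_wday_effect` applies to the first simulated day.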

i0_over_n

the initial per capita infections in the state

initial_growth

exponential growth rate during the unobserved time

sd_in_lab_level_multiplier

standard deviation in the log of the site-lab level multiplier, which determines how much systematic variation there is in site-labs from the state mean

mean_obs_error_in_ww_lab_site

mean day to day variation in observed wastewater concentrations across all lab-sites

mean_reporting_freq

mean frequency of wastewater measurements across sites, in samples per day (e.g. 1/7 is once per week)

sd_reporting_freq

standard deviation in the frequency of wastewater measurements across sites

mean_reporting_latency

mean time from forecast date to last wastewater sample collection date, across sites

sd_reporting_latency

standard deviation in the time from the forecast date to the last wastewater sample collection date, across sites
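To build intuition for how the reporting frequency and latency arguments shape the observed wastewater series, the sketch below draws a sampling schedule per lab-site. Everything distributional here is an assumption made for illustration (normal draws, clipping, independent daily sampling) and may differ from what the function does internally; `sampled` is a hypothetical name.

```r
set.seed(1)
n_lab_sites <- 5
n_days <- 90 + 9  # ot + nt: days up to the forecast date

# Assumption: per-lab-site frequency is normal, clipped to a valid probability
freq <- pmin(pmax(rnorm(n_lab_sites, mean = 1 / 7, sd = 1 / 14), 0.01), 1)

# Assumption: latency is normal, rounded to whole days and floored at zero
latency <- pmax(round(rnorm(n_lab_sites, mean = 7, sd = 5)), 0)

# Logical matrix (days x lab-sites): TRUE where a sample is collected
sampled <- sapply(seq_len(n_lab_sites), function(i) {
  collected <- rbinom(n_days, size = 1, prob = freq[i]) == 1
  last_day <- n_days - latency[i]  # last collection day before the forecast date
  collected & seq_len(n_days) <= last_day
})
```

Days where `sampled` is FALSE would correspond to the NA wastewater values described under Value.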

mean_log_lod

mean of the log of the limit of detection (LOD) in each lab-site

sd_log_lod

standard deviation in the log of the LOD across sites

example_params_path

path to the toml file with the parameters to use to generate the simulated data

Value

A list containing two data frames. example_df contains all the columns needed to build the Stan data for the infection dynamics model. It has a row for every site-lab-day combination, with NA where wastewater concentrations are not observed, so each hospital admissions value is repeated once per site-lab combination. param_df is a single-row data frame of all the static parameters used in the generative model.
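A call that overrides a few arguments and unpacks the returned list might look like the sketch below. It assumes the function is exported from the `cfaforecastrenewalww` package (the package named in the `example_params_path` default) and only runs the call when that package is installed. The element names `example_df` and `param_df` are taken from the description above; no column names are assumed.

```r
# Arguments to override; all others keep their defaults
args <- list(
  n_sites = 4,
  ww_pop_sites = c(4e5, 2e5, 1e5, 5e4),
  n_lab_sites = 5,
  map_site_to_lab = c(1, 1, 2, 3, 4),
  forecast_time = 28
)

# Only run the simulation when the package is available
if (requireNamespace("cfaforecastrenewalww", quietly = TRUE)) {
  sim <- do.call(cfaforecastrenewalww::generate_simulated_data, args)

  example_df <- sim$example_df  # one row per site-lab-day combination
  param_df <- sim$param_df      # single row of static parameters

  str(param_df)
}
```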