Draw posterior samples from a fitted RtGam model

Generate posterior draws from an RtGam fit. Prediction dates can be specified in several ways (see Details), and draws can be generated for observed cases, the growth rate, or the reproduction number.

Usage

# S3 method for class 'RtGam'
predict(
  object,
  parameter = "obs_cases",
  horizon = NULL,
  min_date = NULL,
  max_date = NULL,
  day_of_week = TRUE,
  n = 100,
  mean_delay = NULL,
  gi_pmf = NULL,
  seed = 12345,
  ...
)

Arguments

object

An RtGam object created using the RtGam() function.

parameter

A character string specifying the prediction target. Options are "obs_cases" (observed cases), "r" (growth rate), or "Rt" (reproduction number). Default is "obs_cases".

horizon

Optional. An integer indicating the number of days to forecast beyond the last date in the model fit. For example, horizon = 7 predicts the next 7 days.

min_date, max_date

Optional. Date-like objects specifying the start and end of the prediction range. See Details for more information on their usage.

day_of_week

How to handle day-of-week effects when predicting obs_cases. Defaults to TRUE, which identifies and applies the fitted day-of-week effect, if possible. When automatic detection fails or different levels are desired, supply custom levels as a vector with one element per requested date. If FALSE, the day-of-week effect is turned off (i.e., set to zero). When predicting a parameter other than obs_cases, or when object is an RtGam model fitted without day-of-week effects, the day-of-week effect is turned off and this argument is silently ignored.

n

An integer specifying the number of posterior samples to use for predictions. Default is 100.

mean_delay

Optional. An integer specifying the mean number of days between an individual becoming infected and their case being observed (e.g., through an emergency department visit or hospitalization). This value shifts the predictions to account for reporting delays. It is required when predicting "r" (growth rate) or "Rt" (reproduction number).

gi_pmf

Optional. A numeric vector specifying the generation interval probability mass function (PMF), required when parameter = "Rt". The PMF must be a proper probability distribution (summing to one) with the first element set to zero to exclude same-day transmission, as required by the renewal equation. For more information and tools to handle delay distributions, see the primarycensored package.

seed

An integer specifying the random seed for reproducibility. Default is 12345.

...

Additional arguments passed to the underlying sampling functions:

When parameter = "obs_cases", arguments are passed to gratia::posterior_samples().
When parameter = "r" or "Rt", arguments are passed to gratia::fitted_samples().

Value

A data frame in tidy format, where each row represents a posterior draw for a specific date, with the following columns:

reference_date: The prediction date.
.response: The predicted value for the target parameter.
.draw: The index of the posterior draw.

Example output:

  reference_date .response .draw
1     2023-01-01        18     1
2     2023-01-02        13     1
3     2023-01-03        21     1
4     2023-01-01        11     2
5     2023-01-02        19     2
6     2023-01-03        24     2
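
Because the output is one row per date and draw, it can be summarized with ordinary data frame tools. The following is a minimal base R sketch that rebuilds the example output above by hand and computes a median and a 90% interval per date; the column names `median`, `lower`, and `upper` are illustrative choices, not part of the package.

```r
# Recreate the example output shown above (values copied from the table)
draws <- data.frame(
  reference_date = as.Date(rep(c("2023-01-01", "2023-01-02", "2023-01-03"), times = 2)),
  .response = c(18, 13, 21, 11, 19, 24),
  .draw = rep(1:2, each = 3)
)

# One row per date: median and 90% interval across draws
summary_by_date <- do.call(rbind, lapply(
  split(draws, draws$reference_date),
  function(d) {
    data.frame(
      reference_date = d$reference_date[1],
      median = median(d$.response),
      lower = unname(quantile(d$.response, 0.05)),
      upper = unname(quantile(d$.response, 0.95))
    )
  }
))
```

With only two draws the interval is not meaningful; in practice it would be computed across all n draws.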

Details

Prediction dates can be defined in four ways:

Default Date Range: By using only the fit object, predictions are made across the full date range in the original model.
Using horizon: Extends predictions up to horizon days beyond the last date in the model fit.
Using min_date and horizon: Predictions start from min_date and extend up to horizon days after the fit’s last date.
Using min_date and max_date: Generates predictions for all dates within this specified range, inclusive.
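
The four cases above can be sketched as a small date-resolution helper. `resolve_dates()` is a hypothetical function written for illustration and is not the package's internal code. It assumes that horizon on its own starts the day after the last fitted date, that a missing min_date or max_date falls back to the fitted range, and that supplying both horizon and max_date is an error.

```r
# Hypothetical helper, not part of RtGam: turns the date arguments into
# a vector of prediction dates following the four cases listed above.
resolve_dates <- function(fit_dates, horizon = NULL, min_date = NULL, max_date = NULL) {
  fit_min <- min(fit_dates)
  fit_max <- max(fit_dates)
  if (!is.null(horizon) && !is.null(max_date)) {
    # Assumption: the two ways of setting the end date are mutually exclusive
    stop("Supply either `horizon` or `max_date`, not both")
  }
  if (!is.null(horizon)) {
    # Cases 2 and 3: end `horizon` days after the last fitted date
    start <- if (is.null(min_date)) fit_max + 1 else as.Date(min_date)
    end <- fit_max + horizon
  } else {
    # Cases 1 and 4: fall back to the fitted range where a bound is missing
    start <- if (is.null(min_date)) fit_min else as.Date(min_date)
    end <- if (is.null(max_date)) fit_max else as.Date(max_date)
  }
  seq(start, end, by = "day")
}

fit_dates <- as.Date("2023-01-01") + 0:9
default_range <- resolve_dates(fit_dates)
forecast_only <- resolve_dates(fit_dates, horizon = 7)
from_min_date <- resolve_dates(fit_dates, min_date = "2023-01-08", horizon = 3)
explicit_range <- resolve_dates(fit_dates, min_date = "2023-01-03", max_date = "2023-01-05")
```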

The mean_delay parameter adjusts predictions for the temporal lag between infection and case observation. For example, if mean_delay = 5, the model assumes that observed cases reflect infections that occurred on average five days earlier. This adjustment ensures that estimates of growth rates ("r") and reproduction numbers ("Rt") align with the correct underlying temporal dynamics.

The parameter argument determines the type of predictions:

"obs_cases": Observed cases, including uncertainty from the model's fit.
"r": Growth rate, calculated using the centered difference between time steps.
"Rt": Reproduction number, incorporating delay distributions and convolution.

Samples are drawn from the posterior distribution of the fitted model using the gratia package. The model estimates basis function coefficients on the smooth terms $\widehat{\beta}$ and smoothing parameter(s) $\lambda$. The coefficients have the approximate joint posterior distribution $$\beta | \lambda \sim N(\widehat{\beta}, \mathbf{V}_{\widehat{\beta}})$$ where $\mathbf{V}_{\widehat{\beta}}$ is the covariance matrix of the basis function coefficients, corrected for smoothing-parameter uncertainty. We draw coefficient samples from this approximate posterior and multiply them by the model matrix evaluated at the dates of interest to generate posterior draws. When estimating "Rt" or "r", the day-of-week effect is excluded (i.e., set to zero). Further processing of these draws to generate parameters of interest is described below.
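
The sampling step can be illustrated with a toy example in base R. This is only a sketch of drawing from a multivariate normal and projecting onto prediction rows, not the code gratia runs. The values of `beta_hat`, `V_beta`, and the prediction matrix `X` are made-up stand-ins for quantities that would come from a fitted model.

```r
set.seed(12345)

# Toy stand-ins (assumptions): a real fit supplies these quantities
beta_hat <- c(2.0, 0.5, -0.3)                    # estimated coefficients
V_beta <- matrix(c(0.04, 0.01, 0.00,
                   0.01, 0.09, 0.02,
                   0.00, 0.02, 0.16), nrow = 3)  # coefficient covariance
X <- cbind(1, c(-1, 0, 1), c(1, 0, 1))           # one row per prediction date
n <- 100                                         # number of posterior draws

# Draw beta ~ N(beta_hat, V_beta) as beta_hat + L z, where L L^T = V_beta
L <- t(chol(V_beta))
z <- matrix(rnorm(length(beta_hat) * n), nrow = length(beta_hat))
beta_draws <- beta_hat + L %*% z                 # coefficients x draws

# Project each coefficient draw onto the prediction dates
eta_draws <- X %*% beta_draws                    # dates x draws, linear predictor scale
```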

For the intrinsic growth rate, we draw one day before and one day after every day of interest. We take the difference of the smooth between these two days, which gives the growth over the two-day period, and divide by two to obtain the discrete centered derivative.
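
As a worked example of the centered difference, the sketch below assumes the smooth is evaluated on the log scale (an assumption about the link function) and uses a made-up linear predictor with a constant daily growth of 0.1. `centered_growth_rate()` and `eta()` are hypothetical names used only for illustration.

```r
# Hypothetical helper: discrete centered derivative of the linear predictor
centered_growth_rate <- function(eta_before, eta_after) {
  (eta_after - eta_before) / 2
}

# Made-up linear predictor: log expected cases growing by 0.1 per day
eta <- function(t) log(10) + 0.1 * t

days <- 1:10
r <- centered_growth_rate(eta(days - 1), eta(days + 1))
```

For this linear example the centered difference recovers the constant slope of 0.1 on every day.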

For Rt, we map the estimated values on the linear predictor scale back to the response scale and shift the estimated cases by the mean delay to get the estimated incident infections ($I$). We use the incident infections and the generation interval probability mass function ($w$) to estimate Rt via the Cori method: $R_t = I_t / \sum_{s = 1}^{t} I_{t - s} w_s$.
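
The Rt calculation can be sketched in base R as follows. `cori_rt()` is a hypothetical helper, not the package's implementation. It assumes the first element of `gi_pmf` is the zero weight for same-day transmission, so that the weight for a lag of s days is `gi_pmf[s + 1]`, and it truncates the sum where earlier infections are unavailable. The case counts, dates, and delay are made up.

```r
# Hypothetical helper: Rt as infections divided by the GI-weighted sum of
# past infections.
cori_rt <- function(infections, gi_pmf) {
  # A proper PMF with no same-day transmission, as the gi_pmf argument requires
  stopifnot(gi_pmf[1] == 0, isTRUE(all.equal(sum(gi_pmf), 1)))
  w <- gi_pmf[-1]                      # w[s] is the weight for a lag of s days
  n <- length(infections)
  rt <- rep(NA_real_, n)               # no past infections on the first day
  for (t in 2:n) {
    lags <- seq_len(min(t - 1, length(w)))
    rt[t] <- infections[t] / sum(infections[t - lags] * w[lags])
  }
  rt
}

# Made-up inputs: constant expected cases, observed 5 days after infection
reference_date <- as.Date("2023-01-10") + 0:9
expected_cases <- rep(50, 10)
mean_delay <- 5
infection_date <- reference_date - mean_delay   # shift cases back to infections

rt <- cori_rt(expected_cases, gi_pmf = c(0, 0.5, 0.5))
```

With constant incidence the estimate equals 1 once the full generation interval window is available. Earlier values rest on a truncated window and should be read with caution.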

For observed incident cases, we apply the estimated negative binomial observation error to the posterior expected incident cases to generate posterior predicted incident cases.
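
A minimal sketch of this last step, using base R's `rnbinom()` with the mean and size parameterization. The expected cases and the dispersion value `theta` are made-up numbers; in a real fit both would come from the model.

```r
set.seed(12345)

# Made-up posterior expected cases for three dates (one posterior draw)
expected_cases <- c(18.2, 13.5, 21.0)

# Assumption: an illustrative dispersion value, not an estimate from any fit
theta <- 10

# Add negative binomial observation noise around each expected value
predicted_cases <- rnbinom(length(expected_cases), size = theta, mu = expected_cases)
```

Smaller values of `theta` produce more overdispersed predicted counts around the same expected value.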

References

Miller, David L. "Bayesian views of generalized additive modelling." arXiv preprint arXiv:1902.01330 (2021).

Gostic, Katelyn M., et al. "Practical considerations for measuring the effective reproductive number, Rt." PLoS computational biology 16.12 (2020): e1008409.

Simpson, Gavin L. "Gratia: An R package for exploring generalized additive models." arXiv preprint arXiv:2406.19082 (2024).

Cori A, Ferguson NM, Fraser C, Cauchemez S. A New Framework and Software to Estimate Time-Varying Reproduction Numbers During Epidemics. Am J Epidemiol. 2013;178(9):1505–12. pmid:24043437

Wood, Simon N. Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC, 2017.