Skip to contents

Incident cases are modeled as a smooth function of time with a generalized additive model (GAM). The model is fit with mgcv::gam() and some familiarity with mgcv may be helpful.

Usage

RtGam(
  cases,
  reference_date,
  group = NULL,
  day_of_week = TRUE,
  k = smooth_dim_heuristic(length(cases)),
  m = penalty_dim_heuristic(length(unique(reference_date))),
  backend = "gam",
  warn_for_diagnostic_failure = TRUE,
  ...
)

Arguments

cases

A vector of non-negative incident case counts occurring on an associated reference_date. Missing values (NAs) are not allowed.

reference_date

The associated date on which the count of incident cases occurred. Missing dates are not allowed and dates can only occur once.

group

The grouping variable for the case/reference-date pair. Not yet implemented and a value other than NULL will throw an error.

day_of_week

A boolean or a vector of custom values to be applied to the model as a random effect. If TRUE, then RtGam will use the parsed reference_date values to infer the day of week. If a vector of the same length as reference_date, then the user-supplied values are used as the day-of-week effect, overriding the actual day of week. This approach can be used to, for example, adjust for atypical reporting due to a holiday. If FALSE no day of week effect is applied.

k

An integer, the total dimension of all the smoothing basis functions. Defaults to smooth_dim_heuristic(length(cases)), which picks a reasonable estimate based on the number of provided data points. This total dimension is partitioned between the different smooths in the model. In the case of diagnostic issues in a fitted RtGam model, increasing the value of k above this default and refitting the model is a good first step. See the smooth_dim_heuristic() documentation for more information.

m

An integer, the dimension of the penalty basis for the global smooth trend. If m is greater than 1, the smooth's wiggliness can change over time. An increase in this value above the default should be done carefully. See penalty_dim_heuristic() for more information on m and when to consider changing the default.

backend

One of gam or bam; defaults to gam. In general, models should be fit with mgcv::gam(). If mgcv::gam() is too slow, mgcv::bam() converges more quickly but introduces some additional numerical error. Note that the bam backend uses the discrete = TRUE option for an additional speedup. See mgcv::bam() for more information.

warn_for_diagnostic_failure

Should warnings be issued for automatically identified diagnostic issues? Defaults to TRUE. A list of quantitative model diagnostics can be inspected in the diagnostics slot of the returned RtGam object.

...

Additional arguments passed to the specified modeling backend. For example, the default negative binomial error structure could be changed to poisson in the default mgcv::gam backend by passing family = "poisson".

Value

A fitted model object of type RtGam. The object has named elements:

  • model: The fitted mgcv model object

  • data: The processed data.frame used to fit the RtGam model

  • min_date and max_date: The minimum and maximum reference_date provided

  • k: The user-provided k argument

  • m: The user-provided m argument

  • backend: The user-provided backend argument

  • formula: The formula object provided to the model

  • diagnostics: The quantitative diagnostics of the model fit

Model specification

Incident cases (\(y\)) are modeled as smoothly changing over time:

$$\text{log}\{E(y)\} = \alpha + f_{\text{global}}(t) + \omega(\text{wday}(t))$$

where incidence is negative-binomially distributed, \(f(t)\) is a smooth function of time, and \(\omega(\text{wday}(t))\) is a random day-of-week effect.

References

Mellor, Jonathon, et al. "An Application of Nowcasting Methods: Cases of Norovirus during the Winter 2023/2024 in England." medRxiv (2024): 2024-07.

Ward, Thomas, et al. "Growth, reproduction numbers and factors affecting the spread of SARS-CoV-2 novel variants of concern in the UK from October 2020 to July 2021: a modelling analysis." BMJ open 11.11 (2021): e056636.

See also

smooth_dim_heuristic() more information on the smoothing basis dimension, mgcv::choose.k for more general guidance on GAMs from mgcv, and mgcv::gam/mgcv::bam for documentation on arguments to the model fitting functions.

Examples

withr::with_seed(12345, {
  cases <- rpois(20, 10)
})
reference_date <- seq.Date(
  from = as.Date("2023-01-01"),
  length.out = 20,
  by = "day"
)
fit <- RtGam(cases, reference_date)
fit
#> ===============================
#> Fitted RtGam model object (gam)
#> ===============================
#> 
#> Model type: Non-adaptive (m = 1)
#> Specified maximum smoothing basis dimension: 15 
#> Family: Negative Binomial(1039191.088) 
#> Link function: log
#> Using day-of-week effects
#> ===============================
#> 
#> Observed data points: 20
#> Distinct reference dates: 20
#> Distinct groups: 1
#> Day-of-week levels: 7
#>