Incident cases are modeled as a smooth function of time with a generalized
additive model (GAM). The model is fit with mgcv::gam()
and some
familiarity with mgcv
may be helpful.
Usage
RtGam(
cases,
reference_date,
group = NULL,
day_of_week = TRUE,
k = smooth_dim_heuristic(length(cases)),
m = penalty_dim_heuristic(length(unique(reference_date))),
backend = "gam",
warn_for_diagnostic_failure = TRUE,
...
)
Arguments
- cases
A vector of non-negative incident case counts occurring on an associated
reference_date
. Missing values (NAs) are not allowed.- reference_date
The associated date on which the count of incident
cases
occurred. Missing dates are not allowed and dates can only occur once.- group
The grouping variable for the case/reference-date pair. Not yet implemented and a value other than
NULL
will throw an error.- day_of_week
A boolean or a vector of custom values to be applied to the model as a random effect. If
TRUE
, thenRtGam
will use the parsedreference_date
values to infer the day of week. If a vector of the same length asreference_date
, then the user-supplied values are used as the day-of-week effect, overriding the actual day of week. This approach can be used to, for example, adjust for atypical reporting due to a holiday. IfFALSE
no day of week effect is applied.- k
An integer, the total dimension of all the smoothing basis functions. Defaults to
smooth_dim_heuristic(length(cases))
, which picks a reasonable estimate based on the number of provided data points. This total dimension is partitioned between the different smooths in the model. In the case of diagnostic issues in a fittedRtGam
model, increasing the value ofk
above this default and refitting the model is a good first step. See thesmooth_dim_heuristic()
documentation for more information.- m
An integer, the dimension of the penalty basis for the global smooth trend. If
m
is greater than 1, the smooth's wiggliness can change over time. An increase in this value above the default should be done carefully. Seepenalty_dim_heuristic()
for more information onm
and when to consider changing the default.- backend
One of
gam
orbam
; defaults togam
. In general, models should be fit withmgcv::gam()
. Ifmgcv::gam()
is too slow,mgcv::bam()
converges more quickly but introduces some additional numerical error. Note that thebam
backend uses thediscrete = TRUE
option for an additional speedup. Seemgcv::bam()
for more information.- warn_for_diagnostic_failure
Should warnings be issued for automatically identified diagnostic issues? Defaults to TRUE. A list of quantitative model diagnostics can be inspected in the
diagnostics
slot of the returnedRtGam
object.- ...
Additional arguments passed to the specified modeling backend. For example, the default negative binomial error structure could be changed to poisson in the default mgcv::gam backend by passing
family = "poisson"
.
Value
A fitted model object of type RtGam
. The object has named elements:
model: The fitted mgcv model object
data: The processed data.frame used to fit the
RtGam
modelmin_date and max_date: The minimum and maximum
reference_date
providedk: The user-provided
k
argumentm: The user-provided
m
argumentbackend: The user-provided
backend
argumentformula: The formula object provided to the model
diagnostics: The quantitative diagnostics of the model fit
Model specification
Incident cases (\(y\)) are modeled as smoothly changing over time:
$$\text{log}\{E(y)\} = \alpha + f_{\text{global}}(t) + \omega(\text{wday}(t))$$
where incidence is negative-binomially distributed, \(f(t)\) is a smooth function of time, and \(\omega(\text{wday}(t))\) is a random day-of-week effect.
References
Mellor, Jonathon, et al. "An Application of Nowcasting Methods: Cases of
Norovirus during the Winter 2023/2024 in England." medRxiv (2024): 2024-07.
Ward, Thomas, et al. "Growth, reproduction numbers and factors affecting the
spread of SARS-CoV-2 novel variants of concern in the UK from October 2020
to July 2021: a modelling analysis." BMJ open 11.11 (2021): e056636.
See also
smooth_dim_heuristic()
more information on the smoothing basis
dimension, mgcv::choose.k for more general guidance on GAMs from mgcv
,
and mgcv::gam/mgcv::bam for documentation on arguments to the model
fitting functions.
Examples
withr::with_seed(12345, {
cases <- rpois(20, 10)
})
reference_date <- seq.Date(
from = as.Date("2023-01-01"),
length.out = 20,
by = "day"
)
fit <- RtGam(cases, reference_date)
fit
#> ===============================
#> Fitted RtGam model object (gam)
#> ===============================
#>
#> Model type: Non-adaptive (m = 1)
#> Specified maximum smoothing basis dimension: 15
#> Family: Negative Binomial(1039191.088)
#> Link function: log
#> Using day-of-week effects
#> ===============================
#>
#> Observed data points: 20
#> Distinct reference dates: 20
#> Distinct groups: 1
#> Day-of-week levels: 7
#>