Incident cases are modeled as a smooth function of time with a generalized
additive model (GAM). The model is fit with mgcv::gam()
and some
familiarity with mgcv
may be helpful.
Usage
RtGam(
cases,
reference_date,
group = NULL,
k = smooth_dim_heuristic(length(cases)),
m = penalty_dim_heuristic(length(unique(reference_date))),
backend = "gam",
warn_for_diagnostic_failure = TRUE,
...
)
Arguments
- cases
A vector of non-negative incident case counts occurring on an associated
reference_date
. Missing values (NAs) are not allowed.- reference_date
The associated date on which the count of incident
cases
occurred. Missing dates are not allowed and dates can only occur once.- group
The grouping variable for the case/reference-date pair. Not yet implemented and a value other than
NULL
will throw an error.- k
An integer, the total dimension of all the smoothing basis functions. Defaults to
smooth_dim_heuristic(length(cases))
, which picks a reasonable estimate based on the number of provided data points. This total dimension is partitioned between the different smooths in the model. In the case of diagnostic issues in a fittedRtGam
model, increasing the value ofk
above this default and refitting the model is a good first step. See thesmooth_dim_heuristic()
documentation for more information.- m
An integer, the dimension of the penalty basis for the global smooth trend. If
m
is greater than 1, the smooth's wiggliness can change over time. An increase in this value above the default should be done carefully. Seepenalty_dim_heuristic()
for more information onm
and when to consider changing the default.- backend
One of
gam
orbam
; defaults togam
. In general, models should be fit withmgcv::gam()
. Ifmgcv::gam()
is too slow,mgcv::bam()
converges more quickly but introduces some additional numerical error. Note that thebam
backend uses thediscrete = TRUE
option for an additional speedup. Seemgcv::bam()
for more information.- warn_for_diagnostic_failure
Should warnings be issued for automatically identified diagnostic issues? Defaults to TRUE. A list of quantitative model diagnostics can be inspected in the
diagnostics
slot of the returnedRtGam
object.- ...
Additional arguments passed to the specified modeling backend. For example, the default negative binomial error structure could be changed to poisson in the default mgcv::gam backend by passing
family = "poisson"
.
Value
A fitted model object of type RtGam
. The object has named elements:
model: The fitted mgcv model object
data: The processed data.frame used to fit the
RtGam
modelmin_date and max_date: The minimum and maximum
reference_date
providedk: The user-provided
k
argumentm: The user-provided
m
argumentbackend: The user-provided
backend
argumentformula: The formula object provided to the model
diagnostics: The quantitative diagnostics of the model fit
Model specification
Incident cases (\(y\)) are modeled as smoothly changing over time:
$$\text{log}\{E(y)\} = \alpha + f_{\text{global}}(t)$$
where incidence is negative-binomially distributed and \(f(t)\) is a smooth function of time.
See also
smooth_dim_heuristic()
more information on the smoothing basis
dimension, mgcv::choose.k for more general guidance on GAMs from mgcv
,
and mgcv::gam/mgcv::bam for documentation on arguments to the model
fitting functions.
Examples
withr::with_seed(12345, {
cases <- rpois(20, 10)
})
reference_date <- seq.Date(
from = as.Date("2023-01-01"),
length.out = 20,
by = "day"
)
fit <- RtGam(cases, reference_date)
fit
#> ===============================
#> Fitted RtGam model object (gam)
#> ===============================
#>
#> Model type: Non-adaptive (m = 1)
#> Specified maximum smoothing basis dimension: 15
#> Family: Negative Binomial(993826.867)
#> Link function: log
#> ===============================
#>
#> Observed data points: 20
#> Distinct reference dates: 20
#> Distinct groups: 1
#>