Propose penalty basis dimension from the number of distinct dates
Source: R/RtGam.R
penalty_dim_heuristic.Rd
Return a reasonable value for the m argument of RtGam() based on the number
of dates on which cases are observed. The m argument controls the dimension
of the smoothing penalty basis for the model's global smooth trend (see the
Model specification section of the RtGam() documentation for more
information about the global trend). The penalty basis dimension controls how
much the wiggliness of the global smooth trend can vary over time. Higher
values of m help the model adapt quickly to different epidemic regimes, but
are computationally costly.
How m is used
The parameter m controls the penalty basis dimension of the model's global
smooth trend. If m is 1, there will be a single constant penalty on
wiggliness over the entire smooth, and RtGam will use a thin-plate spline
basis for its superior performance in single-penalty settings. If m is 2 or
more, the model will use m distinct penalties on the smooth trend's
wiggliness and use an adaptive spline basis. The realized penalty at each
timepoint smoothly interpolates between the m estimated wiggliness
penalties. This adaptive penalty increases the computational cost of the
model, but allows a single model to adapt to changing epidemic dynamics
without oversmoothing or introducing spurious wiggly trends.
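The choice of basis described above can be sketched in a few lines of R. This is an illustration only, not RtGam's internal code: the helper name sketch_basis_for_m is hypothetical, and the returned strings are the mgcv basis codes for thin-plate ("tp") and adaptive ("ad") smooths.

```r
# Hypothetical helper, not part of RtGam: map m to the mgcv basis code that
# matches the description above ("tp" = thin-plate, "ad" = adaptive).
sketch_basis_for_m <- function(m) {
  # m must be a single whole number of at least 1
  stopifnot(is.numeric(m), length(m) == 1, m >= 1, m == floor(m))
  if (m == 1) "tp" else "ad"
}

sketch_basis_for_m(1) # one constant penalty: thin-plate basis
sketch_basis_for_m(3) # three penalties: adaptive basis
```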
When to use a different value
Very slow
Decreasing the penalty basis dimension makes the model less demanding to fit.
mgcv describes an adaptive penalty with 10 basis dimensions and 200 data
points as roughly equivalent to fitting 10 GAMs each from 20 data points.
Using a single penalty throughout the model is much simpler than using an
adaptive smooth and should be preferred where possible. See
mgcv::smooth.construct.ad.smooth.spec for more information on how the
adaptive smooth basis uses the penalty dimension.
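As a rough illustration of that rule of thumb, the number of data points informing each penalty can be approximated by dividing the number of observations by m. The helper below is hypothetical, and the even split is a simplifying assumption rather than a description of how mgcv allocates data.

```r
# Hypothetical helper: approximate data points available per penalty,
# assuming the observations are spread evenly across the m penalties.
sketch_points_per_penalty <- function(n_points, m) {
  stopifnot(is.numeric(n_points), is.numeric(m), m >= 1)
  n_points / m
}

sketch_points_per_penalty(200, 10) # the mgcv example: 20 points per penalty
sketch_points_per_penalty(200, 1)  # a single penalty uses all 200 points
```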
Observed over-smoothing of non-stationary data
If a fitted model is observably over-smoothing, it may be reasonable to refit with a higher penalty basis dimension. Moments with a sudden change in epidemic dynamics, such as a sharp epidemic peak, can be challenging to fit with smooth functions. This option should be used with care due to the increased computational cost.
Implementation details
The algorithm to pick m is \(\lfloor \frac{n}{21} \rfloor + 1\) where
\(n \in \mathbb{W}\) is the number of observed dates. This algorithm
assumes that epidemic dynamics remain about equally wiggly over a 21-day
period: sharp jumps or drops requiring a very wiggly trend would remain
similarly plausible over much of the 21-day band.
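The formula can be checked with a short R sketch. The function name sketch_penalty_dim is hypothetical and stands in for the formula only; the packaged function may differ in input validation and return type.

```r
# Sketch of floor(n / 21) + 1, where n is the number of observed dates.
# The function name is hypothetical and used here for illustration.
sketch_penalty_dim <- function(n_dates) {
  stopifnot(is.numeric(n_dates), length(n_dates) == 1, n_dates >= 0)
  as.integer(floor(n_dates / 21) + 1)
}

sketch_penalty_dim(20)  # fewer than 21 dates: 1 penalty
sketch_penalty_dim(21)  # 21 dates: 2 penalties
sketch_penalty_dim(200) # 200 dates: 10 penalties
```

Under this formula, a series needs at least 21 observed dates before a second penalty, and with it the adaptive basis, comes into play.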
See also
RtGam() for the use-case and additional documentation as well as
mgcv::smooth.construct.ad.smooth.spec for an explanation of the
underlying adaptive-smooth machinery.