The NSSP-ESSENCE Regression/EWMA Switch algorithm generalizes the Regression and EWMA algorithms by applying whichever of the two is more appropriate for the data in the baseline. First, adaptive multiple regression is applied, and the adjusted R-squared value of the fitted model is compared against a threshold of 0.60. If the threshold is not met, the model is considered not to explain the data well. In that case, the algorithm switches to the EWMA algorithm, which is better suited to the sparser time series that are common in county-level trends. In the ESSENCE implementation, the EWMA smoothing coefficient is fixed at 0.4, which is the default for w1 here.
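The switching rule described above can be sketched in base R. This is a simplified illustration, not the package's internal code: the helper name `choose_method` is hypothetical, and the model formula (a linear time trend plus day-of-week indicators) is an assumption made for the example. Only the 0.60 adjusted R-squared threshold comes from the description.

```r
# Hypothetical helper, not part of Rnssp: decide which method a single
# baseline window favors, based on the adjusted R-squared of a linear fit.
choose_method <- function(dates, counts, r2_threshold = 0.60) {
  baseline <- data.frame(
    y = counts,
    time = seq_along(counts),
    dow = factor(weekdays(dates)) # day-of-week indicators (assumed model term)
  )
  fit <- lm(y ~ time + dow, data = baseline)
  adj_r2 <- summary(fit)$adj.r.squared
  # A constant baseline gives NaN; treat that as a poor fit as well.
  if (is.na(adj_r2) || adj_r2 < r2_threshold) "ewma" else "regression"
}

dates <- seq.Date(as.Date("2020-01-01"), by = 1, length.out = 28)

# Strong upward trend with small alternating noise: regression fits well.
trend <- 10 + 2 * seq_along(dates) + rep(c(1, -1), times = 14)
choose_method(dates, trend) # "regression"

# Sparse, irregular counts: regression explains little, so switch to EWMA.
sparse <- c(
  0, 1, 0, 0, 2, 0, 0,
  1, 0, 0, 0, 0, 3, 0,
  0, 0, 1, 0, 0, 0, 1,
  0, 2, 0, 0, 1, 0, 0
)
choose_method(dates, sparse) # "ewma"
```

In this sketch the decision is made once for a single 28-day window; the description suggests the real algorithm makes it per baseline as the window moves through the series.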

alert_switch(df, t = date, y = count, B = 28, g = 2, w1 = 0.4, w2 = 0.9)

Arguments

df

A data frame, data frame extension (e.g. a tibble), or a lazy data frame.

t

Name of the column of type Date containing the dates

y

Name of the column of type Numeric containing counts or percentages

B

Baseline parameter. The baseline length is the number of days to which each linear model is fit (default is 28)

g

Guardband parameter. The guardband length is the number of days separating the baseline from the current date in consideration for alerting (default is 2)

w1

Smoothing coefficient for sensitivity to gradual events. Must be between 0 and 1 and is recommended to be between 0.3 and 0.5 to account for gradual effects. Defaults to 0.4 to match ESSENCE implementation.

w2

Smoothing coefficient for sensitivity to sudden events. Must be between 0 and 1 and is recommended to be above 0.7 to account for sudden events. Defaults to 0.9 to match ESSENCE implementation and approximate the C2 algorithm.
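To make the roles of B, g, w1, and w2 concrete, here is a minimal base R sketch. Both helpers (`baseline_index` and `ewma`) are hypothetical names introduced for illustration and are not part of Rnssp. The exact window alignment (the baseline ending g + 1 days before the current day) is an assumption based on the guardband description above.

```r
# Hypothetical helpers for illustration; not part of Rnssp.

# Row positions of the baseline for the observation at position i, assuming
# the B baseline days end just before a guardband of g days.
baseline_index <- function(i, B = 28, g = 2) {
  end <- i - g - 1
  start <- end - B + 1
  if (start < 1) integer(0) else seq(start, end)
}

# Exponentially weighted moving average with smoothing coefficient w:
# z[i] = w * y[i] + (1 - w) * z[i - 1], started at the first observation.
ewma <- function(y, w) {
  z <- numeric(length(y))
  z[1] <- y[1]
  for (i in seq_along(y)[-1]) {
    z[i] <- w * y[i] + (1 - w) * z[i - 1]
  }
  z
}

# With the defaults, day 31 is the first day with a full baseline (days 1-28);
# days 29 and 30 form the guardband.
range(baseline_index(31)) # 1 28

# A flat series with a single-day spike at position 11.
y <- c(rep(5, 10), 25, rep(5, 4))

ewma(y, w = 0.4)[11] # 13: the smaller coefficient damps the spike
ewma(y, w = 0.9)[11] # 23: the larger coefficient tracks it closely
```

The comparison at the spike shows why a larger coefficient is recommended for sudden events: the smoothed value responds almost fully within a single day, while the smaller coefficient needs several consecutive elevated days to move as far.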

Value

A data frame

Examples

# Example 1: simulated daily counts using the default column names (date, count)
df <- data.frame(
  date = seq.Date(as.Date("2020-01-01"), as.Date("2020-12-31"), by = 1),
  count = floor(runif(366, min = 0, max = 101))
)

head(df)

df_switch <- alert_switch(df)

head(df_switch)

# Example 2: simulated daily proportions with non-default column names
df <- data.frame(
  Date = seq.Date(as.Date("2020-01-01"), as.Date("2020-12-31"), by = 1),
  percent = runif(366)
)

head(df)

df_switch <- alert_switch(df, t = Date, y = percent)

head(df_switch)


if (FALSE) {
# Example 3: Data from NSSP-ESSENCE
library(Rnssp)
library(dplyr) # provides group_by() and filter() used below
library(ggplot2)

myProfile <- create_profile()

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?
endDate=20Nov20&ccddCategory=cli%20cc%20with%20cli%20dd%20and%20coronavirus%20dd%20v2
&percentParam=ccddCategory&geographySystem=hospitaldhhsregion&datasource=va_hospdreg
&detector=probrepswitch&startDate=22Aug20&timeResolution=daily&hasBeenE=1
&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TimeSeries&stratVal=
&multiStratVal=geography&graphOnly=true&numSeries=0&graphOptions=multipleSmall
&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=January
&stratVal=&multiStratVal=geography&graphOnly=true&numSeries=0&graphOptions=multipleSmall
&seriesPerYear=false&startMonth=January&nonZeroComposite=false"

url <- url %>% gsub("\n", "", .)

api_data <- get_api_data(url)

df <- api_data$timeSeriesData

df_switch <- df %>%
  group_by(hospitaldhhsregion_display) %>%
  alert_switch(t = date, y = dataCount)

# Visualize alert for HHS Region 4
df_switch_region <- df_switch %>%
  filter(hospitaldhhsregion_display == "Region 4")

df_switch_region %>%
  ggplot() +
  geom_line(aes(x = date, y = dataCount), color = "grey70") +
  geom_line(
    data = subset(df_switch_region, alert != "grey"),
    aes(x = date, y = dataCount), color = "navy"
  ) +
  geom_point(
    data = subset(df_switch_region, alert == "blue"),
    aes(x = date, y = dataCount), color = "navy"
  ) +
  geom_point(
    data = subset(df_switch_region, alert == "yellow"),
    aes(x = date, y = dataCount), color = "yellow"
  ) +
  geom_point(
    data = subset(df_switch_region, alert == "red"),
    aes(x = date, y = dataCount), color = "red"
  ) +
  theme_bw() +
  labs(
    x = "Date",
    y = "Count"
  )
}