Plotting hubverse-formatted output

After generating a hubverse-formatted forecast, it is good practice to inspect the output and confirm that the results make sense. To help, forecasttools provides convenience functions for making timeseries plots from hubverse-formatted output.

First, let’s load forecasttools.

library(forecasttools)

Quantile timeseries

Much hubverse-formatted output is organized into quantiles. An easy spot-check plot shows how these quantiles evolve over the forecast horizon. We can make it using the plot_hubverse_file_quantiles() function. There is only one mandatory argument: the path to a properly hubverse-formatted .csv file. Let’s plot some influenza forecasts submitted to the 2023-24 FluSight Challenge by the cfarenewal-cfaepimlight team for the 2024-04-06 reference date:

path_to_formatted_forecast <- "https://raw.githubusercontent.com/cdcepi/FluSight-forecast-hub/refs/heads/main/model-output/cfarenewal-cfaepimlight/2024-04-06-cfarenewal-cfaepimlight.csv"
plots <- plot_hubverse_file_quantiles(path_to_formatted_forecast)
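
To make the expected input more concrete, here is a minimal sketch of a quantile table with the column layout used by FluSight-style hubverse files. The object name toy_forecast and every value in it are invented for illustration; only the column names follow the hub's submission format.

```r
# Toy quantile forecast with FluSight-style hubverse columns.
# All numbers are invented; they are NOT taken from the real submission.
reference_date <- as.Date("2024-04-06")

# One row per (horizon, quantile level) combination
toy_forecast <- expand.grid(
  horizon = 0:2,
  output_type_id = c(0.025, 0.5, 0.975)
)
toy_forecast$reference_date <- reference_date
toy_forecast$target <- "wk inc flu hosp"
toy_forecast$target_end_date <- reference_date + 7 * toy_forecast$horizon
toy_forecast$location <- "US"
toy_forecast$output_type <- "quantile"

# Invented values: median fixed at 1000, spread growing with the horizon
toy_forecast$value <- round(
  1000 * exp((toy_forecast$output_type_id - 0.5) *
    (1 + toy_forecast$horizon) * 0.5)
)

# Put the columns in the conventional order
toy_forecast <- toy_forecast[, c(
  "reference_date", "target", "horizon", "target_end_date",
  "location", "output_type", "output_type_id", "value"
)]

# Write to a temporary .csv, the kind of file the plotting function reads
toy_path <- file.path(tempdir(), "toy-forecast.csv")
write.csv(toy_forecast, toy_path, row.names = FALSE)
head(read.csv(toy_path))
```

Each quantile level contributes one value per horizon, which is what the quantile timeseries plot traces across target_end_date.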

plot_hubverse_file_quantiles() returns a list of all the plots generated. By default, the list names (keys) are US Postal Service style two-letter abbreviations. Let’s look at the national plot:

plots[["US"]]

We can also look at the plot for Colorado:

plots[["CO"]]
#> Warning in ggplot2::scale_y_continuous(transform = y_transform): log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
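
The warning above is what ggplot2 emits when a log-10 axis meets values whose logarithm is not finite, which most commonly means zeros. That zeros are the cause in this particular file is an assumption; the sketch below only shows the arithmetic behind the message, using an invented vector toy_values.

```r
# Invented values, including a zero
toy_values <- c(0, 5, 50, 500)

# log10(0) is -Inf, which cannot be placed on a log-scale axis
logged <- log10(toy_values)
logged

# Flag which entries would trigger the warning
is.infinite(logged)
```

If the warning is a concern, plotting on a linear scale avoids taking the logarithm at all.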

Of course, you may not wish to generate plots for all locations in your hubverse-formatted file at once. plot_hubverse_file_quantiles() takes an optional locations argument that allows you to plot only a subset. For example, let’s plot the “Four Corners” states:

four_corners <- plot_hubverse_file_quantiles(path_to_formatted_forecast,
  locations = c("AZ", "CO", "NM", "UT")
)

## display New Mexico
four_corners[["NM"]]
#> Warning in ggplot2::scale_y_continuous(transform = y_transform): log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.

Many hubs provide “target data” or “truth data”: observed values of the forecasting target. plot_hubverse_file_quantiles() can plot these alongside the forecast if you pass their location via the observed_data_path argument. Since target data often goes back years, it is useful to set a cutoff using the start_date argument. Here, we’ll start in December 2023.

target_data_path <- "https://raw.githubusercontent.com/cdcepi/FluSight-forecast-hub/04e884dce942dd3b8766aee3d8ff1c333b4fb6fa/target-data/target-hospital-admissions.csv"
plot_hubverse_file_quantiles(path_to_formatted_forecast,
  locations = "US",
  observed_data_path = target_data_path,
  start_date = "2023-12-01"
)
#> $US
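
Conceptually, a start date cutoff amounts to dropping observations dated before it. The base R sketch below illustrates that idea on invented data; it is not the package's implementation, and whether forecasttools treats the cutoff date as inclusive is an assumption here. The names toy_observed and recent are hypothetical.

```r
# Invented weekly observations; not real target data
toy_observed <- data.frame(
  date = seq(as.Date("2023-10-07"), by = "week", length.out = 12),
  value = c(120, 150, 210, 300, 420, 610, 880, 1200, 1700, 2300, 2900, 3100)
)

# Keep only observations on or after the cutoff (inclusive by assumption)
start_date <- as.Date("2023-12-01")
recent <- toy_observed[toy_observed$date >= start_date, ]
recent
```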

The function provides some basic customization of plotted lines and points via the linewidth, pointsize, forecast_linecolor, forecast_pointcolor, obs_pointcolor, and obs_linecolor arguments. It also defaults to plotting on a log10-scale y-axis, but this can be changed by passing a different string to y_transform; any valid value for the transform = argument of ggplot2::scale_y_continuous() can be passed.

## plot forecast data in green, with smaller points and
## lines, and plot on a linear scale
my_custom_plot <- plot_hubverse_file_quantiles(
  path_to_formatted_forecast,
  locations = "US",
  observed_data_path = target_data_path,
  start_date = "2023-12-01",
  forecast_linecolor = "darkgreen",
  forecast_pointcolor = "darkgreen",
  pointsize = 1,
  linewidth = 1,
  y_transform = "identity"
)
my_custom_plot[["US"]]
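
Since y_transform is described above as accepting any valid value for the transform argument of ggplot2::scale_y_continuous(), it can help to compare a few of those values directly in ggplot2. This sketch uses an invented data frame (toy) and assumes ggplot2 3.5.0 or later, where the argument is named transform.

```r
library(ggplot2)

# Invented data spanning several orders of magnitude, including a zero
toy <- data.frame(
  week = 1:5,
  value = c(0, 3, 40, 500, 6000)
)

base_plot <- ggplot(toy, aes(x = week, y = value)) +
  geom_line() +
  geom_point()

# Three transform strings accepted by scale_y_continuous()
plot_log10 <- base_plot + scale_y_continuous(transform = "log10")
plot_linear <- base_plot + scale_y_continuous(transform = "identity")
plot_sqrt <- base_plot + scale_y_continuous(transform = "sqrt")
```

The square-root scale is one option that compresses large values while still being defined at zero, unlike log-10.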

For further customization, you can modify the resulting ggplot objects as you would any other ggplot. For example, we can apply the classic ggplot2 theme to the plot above.

library(ggplot2)
my_custom_plot[["US"]] + theme_classic()
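
Because the function returns a named list of plots, the same modification can be applied to every location at once with lapply(). The sketch below shows the pattern on a hypothetical stand-in list (toy_plots) built from invented data, rather than on real forecasttools output.

```r
library(ggplot2)

# Invented data and a stand-in for a named list of per-location plots
toy <- data.frame(week = 1:4, value = c(10, 20, 15, 30))
toy_plots <- list(
  AZ = ggplot(toy, aes(x = week, y = value)) + geom_line(),
  NM = ggplot(toy, aes(x = week, y = value)) + geom_point()
)

# Apply the same theme and axis labels to every plot in the list
styled_plots <- lapply(toy_plots, function(p) {
  p + theme_classic() + labs(x = "Week", y = "Value")
})

# lapply() keeps the list names, so plots can still be looked up by location
names(styled_plots)
```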