Plotting hubverse formatted output
Source:vignettes/plot-hub-submission.Rmd
After generating a hubverse-formatted forecast, it is good practice to inspect the output and check that the results make sense. To help with this, forecasttools provides convenience functions for making timeseries plots from hubverse-formatted output.
First, let’s load forecasttools.
library(forecasttools)
Quantile timeseries
Much hubverse-formatted output is organized into quantiles. An easy spot-check plot shows how these quantiles evolve over the forecast horizon. We can make it using the plot_hubverse_file_quantiles() function. There is only one mandatory argument: the path to a properly hubverse-formatted .csv file. Let’s plot some influenza forecasts submitted to the 2023-24 FluSight Challenge by the cfarenewal-cfaepimlight team for the 2024-04-06 reference date:
path_to_formatted_forecast <- "https://raw.githubusercontent.com/cdcepi/FluSight-forecast-hub/refs/heads/main/model-output/cfarenewal-cfaepimlight/2024-04-06-cfarenewal-cfaepimlight.csv"
plots <- plot_hubverse_file_quantiles(path_to_formatted_forecast)
plot_hubverse_file_quantiles() returns a list of all the plots generated. By default, the list names (keys) are US Postal Service-style two-letter abbreviations. Let’s look at the national plot:
plots[["US"]]
We can also look at the plot for Colorado:
plots[["CO"]]
#> Warning in ggplot2::scale_y_continuous(transform = y_transform): log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
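Because the return value is an ordinary named list, the usual list tools apply. The sketch below uses a small stand-in list of ggplot objects (toy_plots, built from made-up data, not real forecasts) in place of the output of plot_hubverse_file_quantiles(). It shows one way to list the available locations and save every plot to disk. The file names and the temporary output directory are illustrative choices.

```r
library(ggplot2)

# Stand-in for the named list returned by plot_hubverse_file_quantiles();
# the data here are made up purely for illustration.
toy_data <- data.frame(x = 1:4, y = c(10, 20, 15, 30))
toy_plots <- list(
  US = ggplot(toy_data, aes(x, y)) + geom_line(),
  CO = ggplot(toy_data, aes(x, y)) + geom_point()
)

# See which locations are available
names(toy_plots)

# Save each plot to its own PNG file, named after its location
out_dir <- tempdir()
saved_files <- character(0)
for (loc in names(toy_plots)) {
  out_file <- file.path(out_dir, paste0("forecast_", loc, ".png"))
  ggsave(out_file, plot = toy_plots[[loc]], width = 6, height = 4)
  saved_files <- c(saved_files, out_file)
}
```

With real output, the same loop should work by substituting the list returned by plot_hubverse_file_quantiles() for toy_plots.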
Of course, you may not wish to generate plots for all locations in your hubverse-formatted file at once. plot_hubverse_file_quantiles() takes an optional locations argument that allows you to plot only a subset. For example, let’s plot the “Four Corners” states:
four_corners <- plot_hubverse_file_quantiles(
  path_to_formatted_forecast,
  locations = c("AZ", "CO", "NM", "UT")
)
## display New Mexico
four_corners[["NM"]]
#> Warning in ggplot2::scale_y_continuous(transform = y_transform): log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
Many hubs provide “target data” or “truth data”: observed values of the forecasting target. plot_hubverse_file_quantiles() optionally lets you plot these alongside the forecast data via the observed_data_path argument. Since target data often goes back years, it is useful to set a cutoff using the start_date argument. Here, we’ll start in December 2023.
target_data_path <- "https://raw.githubusercontent.com/cdcepi/FluSight-forecast-hub/04e884dce942dd3b8766aee3d8ff1c333b4fb6fa/target-data/target-hospital-admissions.csv"
plot_hubverse_file_quantiles(
  path_to_formatted_forecast,
  locations = "US",
  observed_data_path = target_data_path,
  start_date = "2023-12-01"
)
#> $US
The function provides some basic customization of plotted lines and points via the linewidth, pointsize, forecast_linecolor, forecast_pointcolor, obs_pointcolor, and obs_linecolor arguments. It defaults to plotting on a log10-scale y-axis, but this can be changed by passing a different string to y_transform; any valid value for the transform = argument of ggplot2::scale_y_continuous() can be passed.
## plot forecast data in green, with smaller points and
## lines, and plot on a linear scale
my_custom_plot <- plot_hubverse_file_quantiles(
  path_to_formatted_forecast,
  locations = "US",
  observed_data_path = target_data_path,
  start_date = "2023-12-01",
  forecast_linecolor = "darkgreen",
  forecast_pointcolor = "darkgreen",
  pointsize = 1,
  linewidth = 1,
  y_transform = "identity"
)
my_custom_plot[["US"]]
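To see what the choice of y_transform changes, here is a minimal ggplot2-only sketch on made-up data (toy_counts is hypothetical). It passes the same kind of string directly to the transform = argument of scale_y_continuous(), which assumes a ggplot2 version recent enough to support that argument name. It also illustrates one plausible source of the “infinite values” warnings shown earlier: log10(0) is -Inf in R, so a zero value cannot be drawn on a log10 axis. Whether the forecast file above actually contains zeros is an assumption here.

```r
library(ggplot2)

# Made-up weekly counts, including one zero (illustration only)
toy_counts <- data.frame(
  week = 1:5,
  admissions = c(0, 12, 150, 900, 4000)
)

base_plot <- ggplot(toy_counts, aes(week, admissions)) +
  geom_line() +
  geom_point()

# The same data under three transforms accepted by scale_y_continuous()
log_plot <- base_plot + scale_y_continuous(transform = "log10")
linear_plot <- base_plot + scale_y_continuous(transform = "identity")
sqrt_plot <- base_plot + scale_y_continuous(transform = "sqrt")

# A zero has no finite position on a log10 axis
zero_on_log_scale <- log10(0)
zero_on_log_scale
```

Printing log_plot should produce a warning similar to the ones above, while linear_plot and sqrt_plot can place the zero without trouble.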
For further customization, you can modify the resulting ggplot objects as you would any other ggplot. For example, we can apply the classic ggplot2 theme to the plot above.
library(ggplot2)
my_custom_plot[["US"]] + theme_classic()
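When a call returns plots for several locations, the same kind of modification can be applied to every element of the list at once. The following is a minimal sketch, using a hypothetical stand-in list (toy_plots, built from made-up data) rather than real output from plot_hubverse_file_quantiles(); the titles it adds are illustrative only.

```r
library(ggplot2)

# Stand-in for a named list of plots; the data are made up
toy_data <- data.frame(x = 1:4, y = c(10, 20, 15, 30))
toy_plots <- list(
  AZ = ggplot(toy_data, aes(x, y)) + geom_line(),
  NM = ggplot(toy_data, aes(x, y)) + geom_line()
)

# Apply the classic theme and a location-specific title to each plot.
# Map() pairs each plot with its name and keeps the list names.
styled_plots <- Map(
  function(p, loc) p + theme_classic() + ggtitle(paste("Location:", loc)),
  toy_plots,
  names(toy_plots)
)

styled_plots[["NM"]]
```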