
Pull relevant epidemiological data from NHSN, defaulting to the data.cdc.gov public API endpoint.

Usage

pull_nhsn(
  api_endpoint = "https://data.cdc.gov/resource/mpgq-jmmr.json",
  api_key_id = Sys.getenv("NHSN_API_KEY_ID"),
  api_key_secret = Sys.getenv("NHSN_API_KEY_SECRET"),
  start_date = NULL,
  end_date = NULL,
  columns = NULL,
  jurisdictions = NULL,
  order_by = c("jurisdiction", "weekendingdate"),
  desc = FALSE,
  limit = 1e+05,
  error_on_limit = TRUE,
  ...
)

Arguments

api_endpoint

API endpoint to use. Defaults to the HTTPS JSON Socrata endpoint for the NHSN COVID, Influenza, and RSV Wednesday data release on data.cdc.gov, namely https://data.cdc.gov/resource/mpgq-jmmr.json

api_key_id

Key ID of an API key to use when querying the dataset. Not required, but supplying one is polite and reduces throttling. You can create one at data.cdc.gov/profile/edit/developer_settings. Defaults to the value of the environment variable NHSN_API_KEY_ID, if any.

api_key_secret

Associated key secret for the API key given in api_key_id. Defaults to the value of the environment variable NHSN_API_KEY_SECRET, if any.
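Because both key arguments default to environment variables, one way to supply credentials is to set those variables before calling pull_nhsn(). A minimal sketch using base R; the key values below are placeholders, not real credentials:

```r
# Placeholder credentials for illustration only; substitute your own key.
Sys.setenv(
  NHSN_API_KEY_ID = "my-key-id",
  NHSN_API_KEY_SECRET = "my-key-secret"
)

# These are the same lookups used as the argument defaults.
# Sys.getenv() returns "" for a variable that is not set.
key_id <- Sys.getenv("NHSN_API_KEY_ID")
key_secret <- Sys.getenv("NHSN_API_KEY_SECRET")
```

To avoid putting secrets in a script, the same variables can instead be set in a .Renviron file, which R reads at startup.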

start_date

Pull only rows with dates greater than or equal to this date. If NULL, no minimum date. Default NULL.

end_date

Pull only rows with dates less than or equal to this date. If NULL, no maximum date. Default NULL.
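Both bounds are inclusive, and a NULL bound means no restriction on that side. The local sketch below only illustrates those semantics on a toy data frame. It assumes the dates in question are those in the weekendingdate column, and filter_dates() is a hypothetical helper, not part of the package (which presumably applies the filter in the API query itself):

```r
# Hypothetical helper illustrating inclusive date bounds; not package code.
filter_dates <- function(df, start_date = NULL, end_date = NULL) {
  keep <- rep(TRUE, nrow(df))
  if (!is.null(start_date)) {
    keep <- keep & df$weekendingdate >= as.Date(start_date)
  }
  if (!is.null(end_date)) {
    keep <- keep & df$weekendingdate <= as.Date(end_date)
  }
  df[keep, , drop = FALSE]
}

# Toy data: three consecutive week-ending dates.
toy <- data.frame(
  weekendingdate = as.Date(c("2024-01-06", "2024-01-13", "2024-01-20"))
)

# Rows on the boundary dates themselves are kept.
filtered <- filter_dates(toy, start_date = "2024-01-13", end_date = "2024-01-20")
```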

columns

Vector of columns to retrieve, in addition to weekendingdate and jurisdiction, which are always retrieved. If NULL, retrieve all columns. Default NULL.

jurisdictions

Value or values to filter on for the jurisdiction column of the NHSN dataset. If NULL, do not filter on that column. Default NULL.

order_by

Column or columns to order (sort) by. Default c("jurisdiction", "weekendingdate") (sort first by jurisdiction, then by date).

desc

Boolean. Whether to order descending instead of ascending. Default FALSE (order ascending).

limit

Maximum number of rows to return. Default 1e5 (100000).

error_on_limit

Boolean. Raise an error if the number of rows returned is equal to the maximum? Default TRUE. This ensures that one does not silently end up with a subset of the total set of rows matching the query. If a subset is desired, one can set error_on_limit = FALSE.
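The check is on equality because a query capped at limit rows cannot return more than that, so hitting the cap exactly leaves it unclear whether further matching rows exist. A minimal sketch of the described behavior; check_row_limit() is a hypothetical helper written for illustration, not the package's implementation:

```r
# Hypothetical helper mirroring the documented behavior; not package code.
check_row_limit <- function(df, limit, error_on_limit = TRUE) {
  if (error_on_limit && nrow(df) == limit) {
    stop(
      "Query returned ", nrow(df), " rows, equal to the limit; ",
      "results may be truncated."
    )
  }
  df
}

rows <- data.frame(x = 1:3)

# Fewer rows than the limit: returned unchanged.
ok <- check_row_limit(rows, limit = 10)

# Row count equals the limit, but the check is disabled: returned unchanged.
subset_ok <- check_row_limit(rows, limit = 3, error_on_limit = FALSE)

# Row count equals the limit with the check enabled: an error is raised.
hit_limit <- tryCatch(
  check_row_limit(rows, limit = 3),
  error = function(e) conditionMessage(e)
)
```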

...

Other arguments passed to nhsn_soda_query().

Value

The pulled data, as a tibble.
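A sketch of a typical call, restricted by date range and jurisdiction. The specific dates, the jurisdiction codes "CA" and "TX", the use of ISO 8601 date strings, and the smaller limit are all assumptions chosen for illustration. The call itself runs only if pull_nhsn() is available in the session, which requires the package that provides it and network access:

```r
# Arguments for an illustrative query; the values are assumptions.
query_args <- list(
  start_date = "2024-10-01",
  end_date = "2024-12-31",
  jurisdictions = c("CA", "TX"), # assumed jurisdiction codes
  limit = 5000
)

# Run the query only when pull_nhsn() is available in this session.
if (exists("pull_nhsn", mode = "function")) {
  nhsn_data <- do.call(pull_nhsn, query_args)
  print(head(nhsn_data))
}
```

With the default error_on_limit = TRUE, a result of exactly 5000 rows here would raise an error rather than silently return a truncated result.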