
Pull a dataset from data.cdc.gov with standard selection and filtering options.

Usage

pull_data_cdc_gov_dataset(
  dataset,
  dataset_lookup_format = "key",
  api_key_id = Sys.getenv("DATA_CDC_GOV_API_KEY_ID"),
  api_key_secret = Sys.getenv("DATA_CDC_GOV_API_KEY_SECRET"),
  start_date = NULL,
  end_date = NULL,
  columns = NULL,
  locations = NULL,
  order_by = NULL,
  desc = FALSE,
  limit = 1e+05,
  error_on_limit = TRUE,
  rename_columns = FALSE,
  ...
)

Arguments

dataset

Dataset key or ID (one of the keys or IDs in data_cdc_gov_dataset_table). The format is determined by the value of dataset_lookup_format.

dataset_lookup_format

Format for the dataset string. One of "key" or "id". Default "key". See data_cdc_gov_dataset_lookup().

api_key_id

Key ID of an API key to use when querying the dataset. Not required, but supplying one is polite and reduces throttling. You can create one at data.cdc.gov/profile/edit/developer_settings. Defaults to the value of the environment variable DATA_CDC_GOV_API_KEY_ID, if any.

api_key_secret

Associated key secret for the API key given in api_key_id. Defaults to the value of the environment variable DATA_CDC_GOV_API_KEY_SECRET, if any.

start_date

Pull only rows with dates greater than or equal to this date. If NULL, no minimum date. Default NULL.

end_date

Pull only rows with dates less than or equal to this date. If NULL, no maximum date. Default NULL.

columns

Vector of columns to retrieve, in addition to the location and date columns for the dataset, which are always retrieved if they exist. Default NULL.

locations

Value or values to filter on for the dataset's location column. If NULL, do not filter on that column. Default NULL.

order_by

Column or columns to order (sort) by. If NULL (default), rows are ordered first by the date column and then by the location column.

desc

Boolean. Whether to order descending instead of ascending. Default FALSE (order ascending).

limit

Maximum number of rows to return. Default 1e5 (100000).

error_on_limit

Boolean. Raise an error if the number of rows returned is equal to the maximum? Default TRUE. This ensures that one does not silently end up with a subset of the total set of rows matching the query. If a subset is desired, one can set error_on_limit = FALSE.
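The behaviour described for error_on_limit can be illustrated with a small stand-alone sketch. The helper name check_row_limit() is hypothetical and is not part of the package; it only mirrors the documented rule (error when the row count reaches the limit) and makes no claim about how the function is actually implemented.

```r
# Hypothetical helper: illustrates the documented error_on_limit rule only.
# It is NOT the package's implementation.
check_row_limit <- function(df, limit, error_on_limit = TRUE) {
  # A query capped at `limit` rows cannot return more than `limit`,
  # so reaching the cap suggests the result may be truncated.
  if (error_on_limit && nrow(df) >= limit) {
    stop(
      "Query returned ", nrow(df), " rows, which reaches the limit of ",
      limit, ". Increase `limit` or set `error_on_limit = FALSE`."
    )
  }
  df
}

# Well under the limit: the data frame is returned unchanged.
small <- data.frame(x = 1:3)
result <- check_row_limit(small, limit = 5)
```

With error_on_limit = FALSE the same helper returns the possibly truncated rows without complaint, which matches the case where a subset is intended.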

rename_columns

Boolean. Rename the dataset-specific date and location columns to date and location, respectively? Default FALSE.

...

Other arguments passed to data_cdc_gov_soda_query().

Value

The pulled data, as a tibble.
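
Example

A minimal sketch of a typical call, assuming the package that provides pull_data_cdc_gov_dataset() is already attached. The wrapper name pull_recent_rows() is hypothetical, and passing dates as ISO 8601 strings is an assumption; the accepted date types are not stated above. Nothing is queried until the wrapper is called.

```r
# Hypothetical wrapper around pull_data_cdc_gov_dataset(); defining it does
# not contact data.cdc.gov. Only arguments documented above are used.
pull_recent_rows <- function(dataset_key, locations = NULL) {
  pull_data_cdc_gov_dataset(
    dataset = dataset_key,
    dataset_lookup_format = "key",
    start_date = "2024-01-01", # assumed format: ISO 8601 string
    end_date = "2024-03-31",   # assumed format: ISO 8601 string
    locations = locations,     # NULL means no location filter
    desc = TRUE,               # order descending instead of ascending
    rename_columns = TRUE      # date/location columns renamed to date, location
  )
}
```

A call would then look like pull_recent_rows("<dataset key>", locations = c("CA", "NY")), where the key must be one listed in data_cdc_gov_dataset_table and the location values shown are placeholders that depend on how the chosen dataset encodes locations.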