Run an Rt Estimation Model Pipeline
pipeline.Rd
This function runs a complete pipeline for fitting an Rt estimation model,
using the EpiNow2 model, based on a configuration file. The pipeline
processes the model, logs its progress, and handles errors by logging
warnings and setting the pipeline status. Output and logs are written to
the specified directories. Additionally, support for uploading logs and
outputs to a blob storage container is planned.
Usage
orchestrate_pipeline(
  config_path,
  output_container = NULL,
  config_container = NULL,
  input_dir = "/input",
  output_dir = "/output"
)
execute_model_logic(config, input_dir, output_dir)
Arguments
- config_path
A string specifying the file path to the JSON configuration file.
- output_container
Optional. The name of the blob storage container to which logs and outputs will be uploaded. If NULL, no upload will occur.
- config_container
Optional. The name of the blob storage container from which the config file will be downloaded.
- input_dir
A string specifying the directory to read inputs from. If passing storage containers, this is where the files will be downloaded to.
- output_dir
A string specifying the directory where output, logs, and other pipeline artifacts will be saved. Defaults to "/output".
- config
A Config object containing configuration settings for the pipeline, including paths to data, exclusions, disease parameters, model settings, and other necessary inputs.
Value
orchestrate_pipeline() returns a boolean: TRUE if the pipeline succeeds and FALSE otherwise. It writes the following files to the output directory:
- Model RDS file (model.rds)
- Sample output in Parquet format (<task_id>.parquet in the samples/ directory)
- Summary output in Parquet format (<task_id>.parquet in the summaries/ directory)
- Log file (logs.txt) in the task directory
execute_model_logic() returns TRUE on success. Errors are caught by the outer pipeline
logic and logged accordingly.
Details
The function reads the configuration from a JSON file and uses this to set
up the job and task identifiers. It creates an output directory structure
based on these IDs and starts logging the process in a file. The main
pipeline process is handled by execute_model_logic(), with errors
caught and logged as warnings. The function will log the success or
failure of the run.
Logs are written to a file in the output directory, and console output is also mirrored in this log file. Error handling is in place to capture any issues during the pipeline execution and ensure they are logged appropriately.
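The catch-and-log behavior described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: run_with_status() and log_line() are hypothetical names, and the log line format is invented for the example.

```r
# Hypothetical helper (not part of the package): run a pipeline step,
# log the outcome, and return TRUE on success or FALSE on failure.
run_with_status <- function(step, log_path) {
  # Append one timestamped line to the log file.
  log_line <- function(level, msg) {
    stamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
    cat(sprintf("[%s] %s: %s\n", stamp, level, msg),
        file = log_path, append = TRUE)
  }

  log_line("INFO", "Starting pipeline step")
  succeeded <- tryCatch(
    {
      step()
      TRUE
    },
    error = function(e) {
      # Record the error as a warning-level entry instead of re-raising it.
      log_line("WARN", conditionMessage(e))
      FALSE
    }
  )
  log_line("INFO", if (succeeded) "Pipeline step succeeded" else "Pipeline step failed")
  succeeded
}

# Usage: one step that succeeds and one that fails.
log_path <- tempfile(fileext = ".txt")
ok <- run_with_status(function() invisible(NULL), log_path)
failed <- run_with_status(function() stop("model did not converge"), log_path)
log_lines <- readLines(log_path)
```

Returning a status flag rather than propagating the error lets the caller decide how to report the failure while the log file still records what went wrong.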
During the execution of the pipeline, the following output files are expected to be generated:
- Model Output: An RDS file of the fitted model is saved in the task-specific directory (model.rds).
- Samples: Parquet files containing the model's sample outputs are saved in a samples subdirectory, named using the task_id (e.g., task_id.parquet).
- Summaries: Parquet files summarizing the model's results are saved in a summaries subdirectory, also named using the task_id (e.g., task_id.parquet).
- Logs: A logs.txt file is generated in the task directory, capturing both console and error messages.
The output directory structure will follow this format:
<output_dir>/
└── <job_id>/
    ├── samples/
    │   └── <task_id>.parquet
    ├── summaries/
    │   └── <task_id>.parquet
    └── tasks/
        └── <task_id>/
            ├── model.rds
            └── logs.txt
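The layout above can be reproduced with base R path helpers. The sketch below is illustrative only: build_output_paths() is a hypothetical helper, not a function exported by the package, and the job and task IDs are made-up values.

```r
# Hypothetical helper (not part of the package): derive the output paths
# for one task and create the directories that will hold them.
build_output_paths <- function(output_dir, job_id, task_id) {
  job_dir <- file.path(output_dir, job_id)
  task_dir <- file.path(job_dir, "tasks", task_id)
  paths <- list(
    samples = file.path(job_dir, "samples", paste0(task_id, ".parquet")),
    summaries = file.path(job_dir, "summaries", paste0(task_id, ".parquet")),
    model = file.path(task_dir, "model.rds"),
    logs = file.path(task_dir, "logs.txt")
  )
  # Create each parent directory; recursive = TRUE also creates
  # the intermediate <job_id>/ and tasks/ levels.
  for (d in unique(dirname(unlist(paths)))) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  paths
}

# Usage with made-up identifiers, writing under a temporary directory.
paths <- build_output_paths(tempdir(), "job-1", "task-a")
```

Keeping samples and summaries at the job level, keyed by task ID, means all tasks in a job write into the same two directories, while model.rds and logs.txt stay isolated per task.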
execute_model_logic() performs the core model fitting process within the Rt estimation pipeline, including reading data, applying exclusions, fitting the model, and writing outputs such as model samples, summaries, and logs.
See also
Other pipeline: fit_model(), format_stan_opts()