Parameters
Default parameters are given in the nextflow.config file. This table lists the parameters that can be changed to a value, path or true/false. When changing these parameters pay attention to the required inputs and make sure that paths line-up and values are within range. To change a parameter you may change with a flag after the nextflow command or change them within your nextflow.config file.
- Please note the correct formatting and the default calculation of submission_wait_time at the bottom of the params table.
Param | Description | Input Required |
--ref_fasta_path | Reference Sequence file path | Yes (path as string) |
--meta_path | Meta-data file path for samples | Yes (path as string) |
--ref_gff_path | Reference gff file path for annotation | Yes (path as string) |
General Subworkflow
Param | Description | Input Required |
--submission | Toggle for running submission | Yes (true/false as bool) |
--annotation | Toggle for running annotation | Yes (true/false as bool) |
--dry_run | Simulate submission and print a log. | No (true/false) |
--workflow | Specifies the workflow to execute, allowing users to choose the appropriate processing method. | Yes (string) |
Workflow Options
The following workflows are available for the --workflow
parameter:
- biosample_and_sra: Runs a submission to BioSample and SRA.
- genbank: Runs a GenBank submission.
- fetch_accessions: Fetches reports and updates the metadata file.
- full_submission: Executes BioSample and SRA submissions, waits 60 seconds multiplied by
params.batch_size
, fetches reports, updates the metadata file with accession IDs, and then performs the GenBank submission. - update_submission: Executes a BioSample submission using an updated metadata Excel file.
Note: The GenBank submission cannot complete without a BioSample accession ID.
General Settings
Param | Description | Input Required |
--date_format_flag | Flag to specify the date format. Options: s (default, YYYY-MM), v (verbose, YYYY-MM-DD), o (original, unchanged) | Yes (string) |
--publish_dir_mode | Mode for publishing directory, e.g., 'copy' or 'move' | Yes (string) |
--remove_demographic_info | Flag to remove demographic info. If true, values in host_sex, host_age, race, ethnicity are set to 'Not Provided' | Yes (true/false) |
--batch_size | The number of samples to prepare in one submission file. | No (integer) |
General Output
Param | Description | Input Required |
--outdir | File path to submit outputs from pipeline | Yes (path as string) |
--overwrite_output | Toggle to overwriting output files in directory | Yes (true/false as bool) |
--final_submission_outdir | Either name or relative/absolute path for the final outputs from submission report fetching | No (string or path) |
Validation
Param | Description | Input Required |
--validation_outdir | File path for outputs specific to validate sub-workflow | Yes (folder name as string) |
--validate_custom_fields | Toggle checks/transformations for custom metadata fields on/off | No (true/false as bool) |
--custom_fields_file | Path to the JSON file containing custom metadata fields and their information | No (path as string) |
Liftoff
Param | Description | Input Required |
--final_liftoff_outdir | File path to liftoff specific sub-workflow outputs | Yes (folder name as string) |
--lift_print_version_exit | Print version and exit the program | Yes (true/false) |
--lift_print_help_exit | Print help and exit the program | Yes (true/false) |
--lift_parallel_processes | Number of parallel processes to use for liftoff | Yes (integer) |
--lift_child_feature_align_threshold | Map only if its child features align with sequence identity greater than this value | Yes (float) |
--lift_unmapped_features_file_name | Name of unmapped features file | Yes (path as string) |
--lift_copy_threshold | Minimum sequence identity in exons/CDS for which a gene is considered a copy; default is 1.0 | Yes (float) |
--lift_distance_scaling_factor | Distance scaling factor; default is 2.0 | Yes (float) |
--lift_flank | Amount of flanking sequence to align as a fraction of gene length | Yes (float between 0.0 and 1.0) |
--lift_overlap | Maximum fraction of overlap allowed by two features | Yes (float between 0.0 and 1.0) |
--lift_mismatch | Mismatch penalty in exons when finding best mapping; default is 2 | Yes (integer) |
--lift_gap_open | Gap open penalty in exons when finding best mapping; default is 2 | Yes (integer) |
--lift_gap_extend | Gap extend penalty in exons when finding best mapping; default is 1 | Yes (integer) |
--lift_minimap_path | Path to minimap if you did not use conda or pip | Yes (N/A or path as string) |
--lift_feature_database_name | Name of the feature database, if none, will use ref gff path to construct one | Yes (N/A or name as string) |
--lift_feature_types | Path to the file containing feature types | Yes (path as string) |
--lift_coverage_threshold | Minimum coverage threshold for feature mapping | Yes (float) |
--repeatmasker_liftoff | Flag to enable or disable RepeatMasker and Liftoff steps | Yes (true/false) |
VADR
Param | Description | Input Required |
--vadr | Toggle for running VADR annotation | Yes (true/false as bool) |
--vadr_outdir | File path to vadr specific sub-workflow outputs | Yes (folder name as string) |
--vadr_models_dir | File path to models for MPXV used by VADR annotation | Yes (folder name as string) |
BAKTA
Controlling Bakta within TOSTADAS uses parameters of the same name with prefix --bakta_
. For more details, visit the Bakta GitHub page.
Param | Description | Input Required |
--bakta | Toggle for running Bakta annotation | Yes (true/false as bool) |
--bakta_db_path | Path to Bakta database if user is supplying database | No (path to database) |
--download_bakta_db | Option to download Bakta database | Yes (true/false) |
--bakta_db_type | Bakta database type (light or full) | Yes (string) |
--bakta_outdir | File path to bakta specific sub-workflow outputs | Yes (folder name as string) |
--bakta_min_contig_length | Minimum contig size | Yes (integer) |
--bakta_threads | Number of threads to use while running annotation | Yes (integer) |
--bakta_genus | Organism genus name | Yes (N/A or name as string) |
--bakta_species | Organism species name | Yes (N/A or name as string) |
--bakta_strain | Organism strain name | Yes (N/A or name as string) |
--bakta_plasmid | Name of plasmid | Yes (unnamed or name as string) |
--bakta_locus | Locus prefix | Yes (contig or name as string) |
--bakta_locus_tag | Locus tag prefix | Yes (autogenerated or name as string) |
--bakta_translation_table | Translation table | Yes (integer) |
--bakta_gram | Gram type for signal peptide predictions | No ('+' '-' '?') |
--save_reference | Option to save the downloaded Bakta database | No (true/false) |
Sample Submission
Param | Description | Input Required |
--biosample | Submit to BioSample | Yes (true/false as bool) |
--sra | Submit to SRA | Yes (true/false as bool) |
--submission_outdir | Either name or relative/absolute path for the outputs from submission | Yes (name or path as string) |
--final_submission_outdir | Either name or relative/absolute path for the final outputs from submission report fetching | No (string or path) |
--prod_submission | Whether to submit samples for test or actual production | Yes (prod or test as string) |
--submission_config | Configuration file for submission to public repos | Yes (path as string) |
--submission_wait_time | Calculated based on sample number (3 * 60 secs * sample_num) | integer (seconds) |
--send_submission_email | Toggle email notification on/off | Yes (true/false as bool) |
--submission_mode | Mode of submission | Yes (string) |
Update Submission
| --original_submission_outdir | Either name or relative/absolute path for the outputs from original submission (the one being updated) | Yes (name or path as string) |
❗ Important note about send_submission_email
: An email is only triggered if Genbank is being submitted to AND table2asn
is the genbank_submission_type
. As for the recipient, this must be specified within your submission config file under 'general' as notif_email_recipient
.
Last update: 2025-09-12