Outputs
Pipeline Overview:
The workflow will generate outputs in the following order:
- Validation
  - Responsible for QC of metadata
  - Aligns sample metadata .xlsx to sample .fasta
  - Formats metadata into .tsv format
- Annotation
  - Extracts features from .gff
  - Aligns features
  - Annotates sample genomes outputting .gff
- Submission
  - Formats for database submission
  - This section runs twice, with the second run occurring after a wait time to allow for all samples to be uploaded to NCBI.
Output Directory Formatting:
The outputs are recorded in the directory specified within the nextflow.config file and will contain the following:
- validation_outputs (name configurable with validation_outdir)
  - name of metadata sample file
    - errors
    - fasta
    - tsv_per_sample
- liftoff_outputs (name configurable with final_liftoff_outdir)
  - name of metadata sample file
    - errors
    - fasta
    - liftoff
    - tbl
- vadr_outputs (name configurable with vadr_outdir)
  - name of metadata sample file
    - errors
    - fasta
    - gffs
    - tbl
- bakta_outputs (name configurable with bakta_outdir)
  - name of metadata sample file
    - fasta
    - gff
    - tbl
- submission_outputs (name and path configurable with submission_outdir)
  - individual_sample_batch_folder
    - biosample
    - sra
    - genbank
    - log_file
- final_submission_outputs (name and path configurable with final_submission_outdir)
  - updated_metadata_Excel_file
  - submission_report_file
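As a quick check after a run, the layout above can be compared against what is actually on disk. The following is a minimal Python sketch, not part of the pipeline: the function name summarize_output_dir and the path "results" are hypothetical, and it assumes the top-level folders keep the names listed above rather than custom values set through the *_outdir parameters.

```python
from pathlib import Path

# Top-level folders as named in this document.
# Assumption: the *_outdir parameters were not changed from these names.
EXPECTED_DIRS = [
    "validation_outputs",
    "liftoff_outputs",
    "vadr_outputs",
    "bakta_outputs",
    "submission_outputs",
    "final_submission_outputs",
]


def summarize_output_dir(output_dir):
    """Map each expected folder name to True if it exists under output_dir."""
    root = Path(output_dir)
    return {name: (root / name).is_dir() for name in EXPECTED_DIRS}


if __name__ == "__main__":
    # "results" is a hypothetical path; use the directory set in nextflow.config.
    for name, present in summarize_output_dir("results").items():
        print(f"{name}: {'found' if present else 'missing'}")
```

A folder reported as missing is not necessarily an error: which annotation folders appear likely depends on which annotation steps were run.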
Understanding Pipeline Outputs:
The pipeline outputs include:
- batch_.tsv files for each sample (one for each sample batch)
- separate fasta files for each sample
- separate gff files for each sample
- separate tbl files containing feature information for each sample
- submission log files
  - These logs are found in the submission_outputs directory within your specified output_directory
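To see at a glance which of the per-sample files listed above were produced, the output directory can be scanned by file extension. This is a rough Python sketch under stated assumptions: the function name group_outputs_by_sample is hypothetical, the files are assumed to use the extensions .tsv, .fasta, .gff and .tbl, and the file name without its extension is assumed to identify the sample (or the batch, for the .tsv files).

```python
from collections import defaultdict
from pathlib import Path

# Assumption: output files carry these extensions; adjust if yours differ.
EXTENSIONS = {".tsv", ".fasta", ".gff", ".tbl"}


def group_outputs_by_sample(output_dir):
    """Return {file stem: sorted list of extensions found} under output_dir."""
    found = defaultdict(set)
    # Walk every subfolder, keeping only files with a recognized extension.
    for path in Path(output_dir).rglob("*"):
        if path.is_file() and path.suffix in EXTENSIONS:
            found[path.stem].add(path.suffix)
    return {stem: sorted(exts) for stem, exts in found.items()}
```

A sample whose list lacks .gff or .tbl may have failed annotation, in which case the errors folder for that run is a reasonable place to look next.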
Last update: 2025-09-12