NGS QC Knowledge Check

Test your understanding of NGS quality control concepts. Select an answer for each question and click Check to see if you’re correct. Your total score will appear at the end after all questions are answered.

1. Why is Quality Control important in Next-Generation Sequencing?

a. NGS is expensive, so it's good to ensure your data is useful in global analyses
b. Erroneous sequencing data can disrupt phylogenetic analyses
c. High quality data is needed to confidently identify novel mutations
d. All of the above

All of these are valid reasons. QC ensures that expensive sequencing runs produce reliable data, prevents erroneous bases from distorting phylogenetic trees, and provides the confidence needed to call novel mutations accurately.

2. What does a Phred score measure?

a. Read length
b. Alignment quality
c. Base quality
d. Genome completeness
e. The number of reads aligned

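Phred scores measure base quality: each score encodes the estimated probability P of an incorrect base call on a logarithmic scale, Q = -10 log10(P). A Phred score of 30 means a 1 in 1,000 chance of error (99.9% accuracy), and a score of 20 means a 1 in 100 chance. Higher scores indicate more confident base calls.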
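The relationship between a Phred score and its error probability can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the quality-string decoder assumes the common Phred+33 ASCII encoding used in most current FASTQ files.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def decode_fastq_quality(qual: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into per-base Phred scores.

    Assumes Phred+33 encoding by default (an assumption; older
    Illumina files used an offset of 64).
    """
    return [ord(char) - offset for char in qual]


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4%}")
    print(decode_fastq_quality("I5!"))
```

For example, the quality character `I` decodes to Q40 under Phred+33, which corresponds to a 1 in 10,000 chance of error.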

3. What does coverage depth measure?

a. Read length
b. The number of reads aligned to a given position
c. Genome completeness
d. Base quality
e. Alignment quality

Coverage depth (or read depth) measures how many sequencing reads align to (cover) a given position in the reference genome. Higher depth provides more confidence in base calls and variant detection.
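To make the definition concrete, the sketch below computes depth at every reference position from a list of read alignments. It assumes each alignment is given as a hypothetical (start, end) pair in 0-based, half-open coordinates and ignores gaps and clipping; in practice a tool such as `samtools depth` reports this from a BAM file.

```python
def depth_per_position(alignments: list[tuple[int, int]], ref_length: int) -> list[int]:
    """Count how many reads cover each reference position.

    alignments: (start, end) pairs, 0-based, end exclusive (an assumed,
    simplified representation of aligned reads).
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (ref_length + 1)
    for start, end in alignments:
        start = max(start, 0)
        end = min(end, ref_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    # A running sum of the difference array gives the depth at each position.
    depth = []
    running = 0
    for position in range(ref_length):
        running += diff[position]
        depth.append(running)
    return depth


if __name__ == "__main__":
    reads = [(0, 5), (3, 8), (4, 6)]
    print(depth_per_position(reads, ref_length=10))
```

With these three example reads on a 10-base reference, position 4 is covered by all three reads, while positions 8 and 9 have no coverage at all.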

4. What metadata is mandatory for public database submission?

a. Clade and Collection Date
b. Collection date and location
c. Patient age, sex, and location
d. There is no mandatory metadata

Public databases like GISAID require collection date and geographic location as mandatory metadata. Patient-level data (age, sex) is encouraged but not required, and clade is typically assigned by the database itself.

5. What kinds of files are viewed in IGV (Integrative Genomics Viewer)?

a. .bam files
b. .fastq files
c. .fasta files
d. .vcf files

IGV is primarily used to visualize .bam files (aligned reads). BAM files contain read alignments mapped to a reference genome, allowing you to inspect coverage, variants, and read quality at specific genomic positions.

6. True or false: if there is amplicon drop-out in the laboratory, you can use complex bioinformatics methods to infer the missing read data from your sample.

a. True
b. False

False. Regions with missing read data should be reported as missing (Ns) in the consensus, NOT filled in from the reference. Bioinformatics methods cannot invent data that was not sequenced. Reference-filling creates false consensus sequences that can mislead downstream analyses and introduce artificial similarity to the reference.
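A minimal sketch of masking instead of reference-filling: any consensus position whose depth falls below a threshold is written as N. The function name and the default threshold of 20 are illustrative assumptions, not values taken from any particular pipeline.

```python
def mask_low_coverage(consensus: str, depth: list[int], min_depth: int = 20) -> str:
    """Replace consensus bases with N wherever depth is below min_depth.

    min_depth=20 is a hypothetical default chosen for illustration only.
    """
    if len(consensus) != len(depth):
        raise ValueError("consensus and depth must have the same length")
    return "".join(
        base if position_depth >= min_depth else "N"
        for base, position_depth in zip(consensus, depth)
    )


if __name__ == "__main__":
    # The middle two positions dropped out, so they become N
    # rather than being copied from a reference.
    print(mask_low_coverage("ACGTAC", [50, 30, 5, 0, 25, 19]))
```

The masked positions stay visibly unknown, so downstream tools can treat them as missing data instead of as agreement with the reference.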

7. What kinds of influenza samples should be sequenced?

a. Random sampling, Ct < 28
b. Outbreak samples, Ct < 32
c. Any influenza-like-illness samples

For routine surveillance, random sampling with a Ct value below 28 helps ensure there is sufficient viral genetic material for high-quality sequencing. Higher Ct values (lower viral load) often result in incomplete genomes with poor coverage.

8. Why does MIRA check the total number of minor variants as a QC metric?

a. To track super-flu mutations
b. To detect reassortants
c. To detect contamination/co-infection
d. To track vaccine efficacy

An unusually high number of minor variants across multiple segments is a strong indicator of contamination or co-infection — meaning reads from two different viral populations are mixed in the same sample. This is flagged as a QC warning.
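The idea of counting minor variants can be sketched as follows. This is not MIRA's implementation: the input format (one dictionary of base counts per position) and both thresholds are assumptions made for illustration.

```python
def count_minor_variants(
    pileup: list[dict[str, int]],
    min_freq: float = 0.05,
    min_depth: int = 20,
) -> int:
    """Count positions where a second allele reaches min_freq.

    pileup: one {base: read_count} dict per position (hypothetical format).
    min_freq and min_depth are illustrative thresholds, not MIRA defaults.
    """
    count = 0
    for base_counts in pileup:
        total = sum(base_counts.values())
        if total < min_depth:
            continue  # too little data to trust allele frequencies here
        ranked = sorted(base_counts.values(), reverse=True)
        # ranked[0] is the consensus allele; ranked[1] is the top minor allele.
        if len(ranked) > 1 and ranked[1] / total >= min_freq:
            count += 1
    return count


if __name__ == "__main__":
    example = [
        {"A": 100},           # no minor allele
        {"A": 90, "G": 10},   # 10% minor allele: counted
        {"C": 99, "T": 1},    # 1% minor allele: below threshold
        {"G": 5, "A": 5},     # depth too low: skipped
    ]
    print(count_minor_variants(example))
```

A sample whose count is far above what is typical for clean samples from the same run would be the kind of case the QC warning is meant to catch.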

9. Which BLAST database would be more complete for Influenza samples?

a. NCBI
b. GISAID

GISAID contains a more comprehensive collection of influenza sequences because many submitters share data exclusively through GISAID before (or instead of) depositing in NCBI GenBank. This makes GISAID the preferred database for influenza BLAST searches.

10. An Influenza B sample was collected from a human in France in December 2025 and was sequenced in January 2026. Its identifier is sample number A123. How should it be named?

a. A/France/A123/2025
b. B/France/A123/2025
c. B/human/France/A123/2026
d. B/human/France/A123/2025

The correct influenza strain naming convention is: Type/Location/Identifier/Collection Year. For human samples, the host is omitted from the strain name. Since this is Influenza B collected from a human, it is B/France/A123/2025. The collection year (2025) is used, not the sequencing year (2026). The host field is only included for non-human isolates (e.g., swine, avian).
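The naming rule above can be expressed as a small helper. The function name and parameters are illustrative, and the swine example values are made up for demonstration.

```python
def strain_name(
    flu_type: str,
    location: str,
    identifier: str,
    collection_year: int,
    host: str = "human",
) -> str:
    """Build a strain name as Type/[Host/]Location/Identifier/Collection Year.

    The host is included only for non-human isolates, and the year is the
    collection year, not the sequencing year.
    """
    parts = [flu_type]
    if host.lower() != "human":
        parts.append(host)
    parts.extend([location, identifier, str(collection_year)])
    return "/".join(parts)


if __name__ == "__main__":
    # The quiz scenario: collected in December 2025, sequenced in January 2026.
    print(strain_name("B", "France", "A123", 2025))
    # A hypothetical non-human isolate keeps the host field.
    print(strain_name("A", "Iowa", "X1", 2024, host="swine"))
```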

11. In your lab, you receive an Influenza sample that tests positive on RT-PCR for both Influenza A H3 and Influenza A H1. What would this sample be considered?

a. Co-infection
b. Contamination
c. Reassortant

Co-infection. The patient is infected with two different Influenza A subtypes simultaneously (H3 and H1). This is detected at the RT-PCR stage before sequencing. Do not proceed with sequencing this sample — mixed populations will produce uninterpretable assemblies.

12. In your lab, you receive an Influenza sample that tests positive on RT-PCR for Influenza A H3. Following NGS, it assembles as H3N1 with a high minor variant count in all internal segments. What would this sample be considered?

a. Co-infection
b. Contamination
c. Reassortant

Contamination. The RT-PCR only detected H3, but the assembly shows a mismatched neuraminidase (N1 instead of the expected N2) along with high minor variants across all internal segments. This pattern indicates laboratory contamination — reads from a different sample (likely H1N1) were mixed in during library preparation or sequencing. Do not submit this genome to GISAID!

13. In your lab, you receive an Influenza sample that tests positive on RT-PCR for Influenza A H3. Following NGS, it assembles as H3N1 with a small number of minor variants. All QC metrics pass in MIRA. You BLAST each segment and all are H3N2-like except for the NA segment, which is H1N1-like. You re-sequence the sample and get the same results. What would this sample be considered?

a. Co-infection
b. Contamination
c. Reassortant

Reassortant! Key indicators: (1) QC metrics pass with low minor variants (clean assembly from a single population), (2) only ONE segment is mismatched (NA is H1N1-like while all others are H3N2-like), and (3) re-sequencing reproduces the same result. This virus acquired its NA segment from an H1N1 lineage through reassortment — a biologically real event worth reporting.
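The reasoning across questions 11 to 13 can be summarized as a simple decision rule. The sketch below is a deliberate simplification of those three scenarios, with hypothetical parameter names; real cases need expert review and are not settled by a function like this.

```python
def classify_sample(
    pcr_subtypes: set[str],
    mismatched_segments: set[str],
    high_minor_variants: bool,
    reproduced_on_resequencing: bool,
) -> str:
    """Classify a sample following the logic of the three quiz scenarios.

    pcr_subtypes: subtypes detected by RT-PCR, e.g. {"H3"}.
    mismatched_segments: segments whose BLAST lineage disagrees with the
    rest of the genome, e.g. {"NA"} (hypothetical representation).
    """
    # Two subtypes already visible at RT-PCR: mixed infection in the patient.
    if len(pcr_subtypes) > 1:
        return "co-infection"
    if mismatched_segments:
        # Mixed signal in the reads points to a second population in the library.
        if high_minor_variants:
            return "contamination"
        # Clean assembly plus a reproducible mismatch points to a real event.
        if reproduced_on_resequencing:
            return "reassortant"
        return "inconclusive: re-sequence"
    return "no conflict detected"


if __name__ == "__main__":
    print(classify_sample({"H3", "H1"}, set(), False, False))  # question 11
    print(classify_sample({"H3"}, {"NA"}, True, False))        # question 12
    print(classify_sample({"H3"}, {"NA"}, False, True))        # question 13
```

The ordering matters: minor variant count is checked before reproducibility because a contaminated library can still give a mismatched segment.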

14. What kind of pre-assembly QC does the MIRA pipeline perform for you?

a. Primer trimming
b. Quality filtering
c. Read length filtering
d. Optional running of FastQC and MultiQC
e. All of the above

MIRA performs all of these pre-assembly QC steps: it trims primer sequences, filters reads by quality score, removes reads that are too short, and can optionally generate FastQC/MultiQC reports for manual review of raw read quality.
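As an illustration of what quality and length filtering mean in general, the sketch below keeps a read only if it is long enough and its mean Phred score is high enough. It is not MIRA's code: the thresholds are hypothetical and Phred+33 encoding is assumed.

```python
def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(char) - offset for char in qual) / len(qual)


def passes_filters(
    seq: str,
    qual: str,
    min_length: int = 50,
    min_mean_q: float = 20.0,
) -> bool:
    """Return True if a read passes length and mean-quality filters.

    min_length and min_mean_q are illustrative values, not pipeline defaults.
    """
    if not seq or len(seq) != len(qual):
        return False  # empty or malformed record
    if len(seq) < min_length:
        return False  # too short to align reliably
    return mean_quality(qual) >= min_mean_q


if __name__ == "__main__":
    print(passes_filters("A" * 60, "I" * 60))  # long, Q40 throughout
    print(passes_filters("A" * 10, "I" * 10))  # too short
    print(passes_filters("A" * 60, "#" * 60))  # Q2 throughout
```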

15. What kind of post-assembly QC does the MIRA pipeline perform for you?

a. Contamination checking
b. Coverage depth thresholds
c. Completeness thresholds
d. Premature stop codons
e. All of the above

MIRA performs all of these post-assembly QC checks: it screens for contamination via minor variant counts, enforces minimum coverage depth thresholds, checks genome completeness, and scans coding regions for premature stop codons that may indicate assembly errors or pseudogenes.
