Deciphering the mysteries of human genomes — Anna Přistoupilová

Explore how DNA sequencing and bioinformatics help diagnose rare genetic diseases, from processing raw genomic data to identifying disease-causing variants and assessing their clinical impact.

Key takeaways
  • The human genome contains about 3 billion letters (base pairs); only about 2% is coding DNA (genes), while the rest is non-coding DNA, some of which has regulatory functions

  • Rare diseases collectively affect up to 6% of the population, with over 6,000 different types identified:

    • 72% are genetic in origin
    • 70% start in childhood
    • Only 5% have available treatments
    • It can take 15-17 years for research findings to reach clinical practice
  • DNA sequencing technologies have evolved significantly:

    • First human genome (2001): 13 years, $3 billion
    • Current cost: ~$500 per genome
    • Next-generation sequencing improved clinical diagnosis rates from 1% to 50%
  • Key sequencing approaches:

    • Whole genome sequencing (entire DNA)
    • Exome sequencing (only the ~2% of the genome that is coding)
    • Targeted sequencing (specific genes)
    • Nanopore sequencing (long reads)
  • Bioinformatics workflow includes:

    • Raw sequence data processing (FASTQ format)
    • Quality control and filtering
    • Mapping to reference genome
    • Variant identification and annotation
    • Data analysis using tools like Python, Biopython, and Bioconda
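The first two workflow steps (reading FASTQ records and quality filtering) can be sketched in plain Python. This is a minimal illustration, not the pipeline used in the talk: the helper names (`parse_fastq`, `mean_phred`, `filter_reads`), the example reads, and the mean-quality threshold of 20 are all assumptions, and it assumes simple four-line records with Phred+33 quality encoding. In practice a library such as Biopython (`Bio.SeqIO.parse(handle, "fastq")`) would normally do the parsing.

```python
import io


def parse_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality  # drop the leading '@'


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


def filter_reads(records, min_mean_quality=20.0):
    """Keep reads whose mean Phred score reaches the (hypothetical) threshold."""
    return [r for r in records if r[2] and mean_phred(r[2]) >= min_mean_quality]


# Two made-up reads: one with high quality ('I' = Phred 40), one with low ('!' = Phred 0).
EXAMPLE_FASTQ = (
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGTACGT\n+\n!!!!!!!!\n"
)

records = list(parse_fastq(io.StringIO(EXAMPLE_FASTQ)))
kept = filter_reads(records)
print([read_id for read_id, _, _ in kept])
```

Reads that pass this kind of filter would then go on to mapping against the reference genome, which is done with dedicated aligners rather than hand-written code.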
  • Genetic variants can have different effects:

    • Silent mutations (no effect)
    • Missense mutations (amino acid change)
    • Nonsense mutations (premature stop codon, truncated protein)
    • Insertions/deletions (can shift the reading frame)
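The variant categories above can be illustrated with a small classifier built on the standard genetic code. This is a simplified sketch under stated assumptions: the function names are hypothetical, codons are given on the coding strand as DNA, and real annotation tools also account for transcripts, splicing, and strand. An insertion or deletion shifts the reading frame only when its length is not a multiple of three.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon substitution by its effect on the protein."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"  # new stop codon, protein terminates early
    if ref_aa == "*":
        return "stop-lost"  # original stop codon replaced by an amino acid
    return "missense"


def classify_indel(ref_allele, alt_allele):
    """Classify an insertion/deletion by whether it shifts the reading frame."""
    length_change = abs(len(alt_allele) - len(ref_allele))
    return "in-frame indel" if length_change % 3 == 0 else "frameshift"


print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("GAA", "GTA"))  # Glu -> Val
print(classify_substitution("GAA", "TAA"))  # Glu -> stop
print(classify_indel("A", "AT"))            # 1-base insertion
print(classify_indel("A", "ATTT"))          # 3-base insertion
```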
  • When disease-causing variants are found:

    • Validate findings
    • Check family members
    • Find other affected families
    • Consider treatment options
    • Provide genetic counseling
    • Submit to databases
    • Publish findings