Whole Genome Sequencing

Overview

Obtaining a comprehensive picture of an organism’s genome

The aim of whole-genome sequencing (WGS) is to determine an organism’s complete DNA sequence in a single experiment. As such, WGS provides a comprehensive picture of both the coding and non-coding regions of chromosomal and mitochondrial DNA, as well as chloroplast DNA (in plants). WGS enables the detection of all types of genetic variation, including single-nucleotide polymorphisms (SNPs), small insertions and deletions (indels), and structural variants, such as translocations and copy number variation (CNV) [1].
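To make the distinction between these variant classes concrete, the following is a minimal sketch that sorts a variant into a class based only on the lengths of its reference and alternate alleles. The function name `classify_variant` and the 50 bp cut-off between indels and structural variants are assumptions made for illustration (50 bp is a commonly used convention, not a value taken from this page). Translocations and copy number changes cannot be described by a simple allele pair and are outside the scope of this sketch.

```python
def classify_variant(ref: str, alt: str, sv_min_len: int = 50) -> str:
    """Classify a variant from its reference and alternate allele strings.

    sv_min_len is an assumed size threshold (in bp) above which a length
    change is reported as a structural variant rather than a small indel.
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNP" if ref != alt else "no change"

    size = abs(len(ref) - len(alt))
    if size == 0:
        # Same length, more than one base: a multi-base substitution.
        return "multi-base substitution"

    kind = "insertion" if len(alt) > len(ref) else "deletion"
    if size >= sv_min_len:
        return f"structural {kind}"
    return f"small {kind} (indel)"


if __name__ == "__main__":
    print(classify_variant("A", "G"))
    print(classify_variant("A", "ATG"))
    print(classify_variant("A" * 61, "A"))
```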

The genome of bacteriophage φX174 (5,386 bp) was the first genome to be fully sequenced, by Fred Sanger and colleagues in 1977 [2]. In the 14 years that followed, the Sanger method was used to sequence small genomes, such as those of bacteriophages and viruses (all in the 50–200 kb range), as well as the first genome of a free-living organism (Haemophilus influenzae, 1.8 Mb; published in 1995 [3]). Sanger sequencing was also used to sequence the first plant genome (Arabidopsis thaliana, 135 Mb; published in 2000 [4]) and the first draft of the human genome (published in 2001 [5]). The advent of next-generation sequencing (NGS) made sequencing of the first human cancer genome possible (published in 2008 [6]). Continuous improvements in NGS technology (and concomitant reductions in per-base cost) have since enabled routine, high-throughput WGS of both simple and highly complex genomes.

Among other applications, WGS research enables us to:

  • gain deeper insight into the genomic basis of health, disease and ancestry than is possible with targeted sequencing approaches
  • discover biomarkers and understand pharmacogenetics
  • perform genome-level comparative analysis, to identify synteny, orthologs and horizontal gene transfer events
  • generate reference genomes for agriculturally important animals and plants, to assist with breeding
  • support ecology and conservation biology
  • understand disease outbreaks and public health
  • secure food safety
  • understand antibiotic resistance
  • study microbiomes and their role in human health and disease

 

Sample Prep for WGS

As is the case for all NGS applications, sample prep constitutes the first step in the WGS workflow, and holds the key to unlocking the potential of every sample. Because NGS samples are precious, the best sample prep solutions are needed to process more samples successfully, get more information from every sample and optimize your sequencing resources. Roche Sample Prep Solutions offer an integrated approach to sample preparation, addressing all of the steps required to convert a sample to a sequencing-ready library. From sample collection to library quantification, we offer sample prep solutions for different sample types and sequencing applications that are proven, simple and complete.

Library construction for WGS starts with fragmenting DNA to the appropriate size, after which platform-specific adapters are added. PCR-free workflows are preferred for WGS, but where input DNA is limited or of poor quality, library amplification is required. WGS library construction protocols typically include a size-selection step, because a narrow library fragment distribution facilitates data analysis. Quantification and QC of sequencing-ready libraries are important to ensure optimal clonal amplification on NGS platforms. After sequencing, sequence reads are aligned against a reference genome (reference-guided sequence assembly) or, when no such reference is available, compared to each other and assembled into long contiguous segments (de novo assembly). This general workflow applies to the sequencing of both simple (e.g. bacterial) and complex (e.g. human) genomes, but these applications pose very different challenges.
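The difference between the two analysis routes described above can be illustrated with a toy example. The sketch below is not a production aligner or assembler: it places a read on a reference by exact string matching, and it builds contigs with a simple greedy overlap merge. The function names (`place_read`, `overlap`, `greedy_assemble`), the minimum overlap of 3 bases and the example reads are all hypothetical and chosen only for illustration. Real tools also handle sequencing errors, reverse complements and repeats.

```python
from __future__ import annotations


def place_read(reference: str, read: str) -> int:
    """Reference-guided: return the 0-based position of an exact match, or -1."""
    return reference.find(read)


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """De novo: repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            # No remaining pair overlaps by at least min_len bases.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]  # made-up overlapping reads
    contigs = greedy_assemble(reads)
    print(contigs)
    print(place_read(contigs[0], "GCGTACC"))
```

With these three made-up reads, the merge yields a single contig, and each read can then be placed back on it by position. Reads that share no sufficiently long overlap remain as separate contigs.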

Learn more about sample prep for human WGS.

Learn more about sample prep for bacterial WGS.

 

References:

  1. Smedley D, Schubach M, Jacobsen JOB, et al. A Whole-Genome Analysis Framework for Effective Identification of Pathogenic Regulatory Variants in Mendelian Disease. Am J Hum Genet. 2016;99:595. doi:10.1016/j.ajhg.2016.07.005.
  2. Sanger F, Air GM, Barrell BG, et al. Nucleotide sequence of bacteriophage φX174 DNA. Nature. 1977;265(5596):687. doi:10.1038/265687a0.
  3. Fleischmann RD, Adams MD, White O, et al. Whole-Genome Random Sequencing and Assembly of Haemophilus influenzae Rd. Science. 1995;269:496. doi:10.1126/science.7542800.
  4. Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796. doi:10.1038/35048692.
  5. Venter JC, Adams MD, Myers EW, et al. The Sequence of the Human Genome. Science. 2001;291:1304. doi:10.1126/science.1058040.
  6. Ley TJ, Mardis ER, Ding L, et al. DNA sequencing of a cytogenetically normal acute myeloid leukemia genome. Nature. 2008;456(7218):66. doi:10.1038/nature07485.