Whole-genome sequencing (WGS) provides access to the wealth of information contained in an organism’s complete DNA sequence—including a comprehensive picture of both the coding and non-coding regions—from a single sample in a single experiment. Changes in the enabling technologies have led to significant reductions in per-base sequencing costs, thus driving interest and investment in WGS for discovery of new biomarkers and drug targets, and to advance our understanding of health and disease, as well as fitness and longevity, of humans and other organisms.

To maximize the effectiveness and efficiency of WGS, several best practices should be followed.

Start with high-molecular-weight DNA

The highest-throughput sequencing instruments require libraries with inserts between 350 and 650 bp for WGS. To achieve this, it’s best to begin by extracting DNA from whole blood or a suitable equivalent source using a protocol or an instrument that is known to yield high-quality, high-molecular-weight DNA. The DNA can then be sheared into fragments and size-selected to the optimal size for the specific workflow.
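As a rough illustration of checking a fragment-size profile against the 350 to 650 bp window mentioned above, the following minimal sketch summarizes a list of fragment lengths. The function name `insert_size_summary` and the example sizes are hypothetical; in practice the lengths would come from an electrophoresis trace or from aligned read pairs.

```python
from statistics import median


def insert_size_summary(sizes_bp, low=350, high=650):
    """Summarize how many fragment sizes fall inside the target insert window.

    `low` and `high` default to the 350-650 bp range quoted in the text;
    adjust them for the specific instrument and workflow.
    """
    if not sizes_bp:
        raise ValueError("sizes_bp must contain at least one fragment size")
    in_window = [s for s in sizes_bp if low <= s <= high]
    return {
        "median_bp": median(sizes_bp),
        "fraction_in_window": len(in_window) / len(sizes_bp),
    }


# Made-up fragment lengths (bp), for illustration only.
sizes = [300, 380, 420, 450, 470, 510, 560, 640, 700, 900]
summary = insert_size_summary(sizes)
print(summary)
```

A low fraction in the window would suggest revisiting the shearing or size-selection settings before committing precious samples.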

Shearing

A fragmentation process is typically used to generate DNA pieces of the appropriate size for WGS library construction. Non-mechanical methods, such as enzymatic digestion, have traditionally introduced potential bias into the resulting library. Despite recent advances in non-mechanical methods, mechanical methods are still used for nearly all high-throughput WGS library construction.


Mechanical shearing of DNA introduces little or no apparent genomic bias and causes no loss of sample, although external factors such as shearing volume, water bath temperature, and sample concentration and viscosity need to be taken into account. Suboptimal shearing can be corrected later with size selection, but it’s best to optimize shearing protocols before precious samples are processed and wasted.

No PCR

Historically, WGS required large amounts of sample DNA to provide adequate sequencing coverage of the genome, so samples were often amplified by PCR before library construction. Yet mammalian genomes contain elements that are notoriously difficult to amplify and sequence, including repetitive sequences, regions of extreme (<25% or >75%) GC content, and low-complexity regions. Amplification of such regions tends to result in significant GC bias.
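To make the GC thresholds above concrete, here is a minimal sketch that scans a sequence in fixed, non-overlapping windows and flags those with GC content below 25% or above 75%. The function names, the 100 bp window size, and the toy sequence are illustrative assumptions, not part of any particular pipeline.

```python
def gc_fraction(seq):
    """Return the GC fraction among called bases (A/C/G/T), or None if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else None


def extreme_gc_windows(seq, window=100, low=0.25, high=0.75):
    """Yield (start, gc) for non-overlapping windows whose GC content is extreme.

    Windows without any called bases (for example, all N) are skipped.
    A trailing partial window is ignored.
    """
    for start in range(0, len(seq) - window + 1, window):
        gc = gc_fraction(seq[start:start + window])
        if gc is not None and (gc < low or gc > high):
            yield start, gc


# Toy sequence: an AT-only block, a balanced block, and a GC-only block.
toy = "AT" * 50 + "ACGT" * 25 + "GC" * 50
flagged = list(extreme_gc_windows(toy))
print(flagged)
```

Regions flagged this way are the ones where amplification-based protocols are most likely to show uneven coverage.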



Figure 1. PCR-free library construction workflow


More recently, PCR-free chemistries have been optimized to achieve higher conversion of input DNA to adapter-ligated library fragments, enabling researchers to use lower amounts of input DNA, and even samples of variable quality, while still achieving higher success rates. PCR-free protocols eliminate a source of amplification-associated bias and result in improved coverage uniformity and higher depth than methods relying on PCR. Thus, PCR-free library prep has become standard for large-scale WGS projects.

Figure 2. GC bias plots for libraries prepared for whole-genome shotgun sequencing of bacteria with extreme GC content.

Flexible and efficient kits

But not all PCR-free chemistries are created equal, and neither are the kits that contain them. Because samples may be of varying quality and concentration, users should look for WGS library-preparation kits capable of efficiently converting a wide range of input DNA to sequenceable fragments without drastic changes in the protocol. Efficient conversion preserves sample complexity, especially in PCR-free workflows, requires less input DNA, and is effective for a broad range of challenging sample types. Kits should also allow for different strategies, which may be tailored to different scenarios, so as to offer even greater flexibility.


Figure 3. Library quantification via qPCR leads to accurate sample pooling and optimal clustering.

At the same time, you’ll likely want your sample-preparation solutions to be automation-friendly, with streamlined library construction methods that reduce turnaround time and improve reproducibility on production-scale Illumina sequencers. Kits with single-tube prep chemistry offer rapid turnaround times with additional hands-on time savings.

Quantify by qPCR

Accurate quantification of NGS libraries is essential to ensure that libraries are accurately normalized and pooled for multiplexed sequencing, and that libraries or pools can be accurately diluted to the optimal concentration for cluster generation. Sequencing capacity is maximized when sequencing-competent molecules are accurately measured.
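The normalization, pooling, and dilution arithmetic described above can be sketched as follows. This is a minimal illustration that assumes library concentrations are already known in nM; the function names, library labels, and target values are hypothetical. It relies on the unit identity 1 nM = 1 fmol/µL and on the standard dilution relation C1·V1 = C2·V2.

```python
def equimolar_pool_volumes(conc_nM_by_library, fmol_per_library):
    """Return the volume (in microliters) of each library that contributes equal moles.

    Because 1 nM equals 1 fmol per microliter, volume = fmol / nM.
    """
    volumes = {}
    for name, conc_nM in conc_nM_by_library.items():
        if conc_nM <= 0:
            raise ValueError(f"library {name!r} has a non-positive concentration")
        volumes[name] = fmol_per_library / conc_nM
    return volumes


def dilution_volumes(stock_nM, target_nM, final_volume_ul):
    """Return (stock_ul, diluent_ul) using C1 * V1 = C2 * V2."""
    if target_nM > stock_nM:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_nM * final_volume_ul / stock_nM
    return stock_ul, final_volume_ul - stock_ul


# Made-up concentrations (nM) for three libraries.
libraries = {"libA": 4.0, "libB": 8.0, "libC": 2.0}
pool = equimolar_pool_volumes(libraries, fmol_per_library=20.0)
stock_ul, diluent_ul = dilution_volumes(stock_nM=4.0, target_nM=2.0, final_volume_ul=20.0)
print(pool, stock_ul, diluent_ul)
```

Any error in the measured concentrations propagates directly into these volumes, which is why the accuracy of the quantification method matters for pooling balance and cluster density.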

qPCR is the preferred method of quantification. qPCR-based library quantification methods detect all sequencing-competent molecules, and only those, including single-stranded configurations often created in PCR-based workflows. In contrast, standard quantification methods (fluorometry, spectrophotometry, and electrophoresis) measure total nucleic acid concentration, including fragments that can’t serve as templates. Moreover, because qPCR is extremely sensitive, it allows for the quantification of dilute libraries and consumes very small amounts of library.
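One common way to turn qPCR readings into concentrations is a standard curve: quantification cycle (Cq) values from a dilution series of known standards are fitted against log10 concentration, and the fit is then inverted for the unknown library. The sketch below assumes that approach; the standards, Cq values, and dilution factor are made up for illustration, and kit-specific corrections (such as fragment-size adjustment) are deliberately left out.

```python
import math


def fit_standard_curve(concs_pM, cqs):
    """Least-squares fit of Cq = slope * log10(conc) + intercept."""
    xs = [math.log10(c) for c in concs_pM]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 corresponds to perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def concentration_from_cq(cq, slope, intercept):
    """Invert the standard curve to get a concentration in the units of the standards."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical tenfold dilution series of standards (pM) and their Cq values.
standards_pM = [20.0, 2.0, 0.2, 0.02]
standard_cqs = [8.0, 11.32, 14.64, 17.96]

slope, intercept = fit_standard_curve(standards_pM, standard_cqs)
efficiency = amplification_efficiency(slope)

# A library diluted 1:10,000 before qPCR (assumed), read at Cq 11.32.
diluted_pM = concentration_from_cq(11.32, slope, intercept)
undiluted_nM = diluted_pM * 10_000 / 1000.0  # pM to nM after undoing the dilution
print(slope, efficiency, diluted_pM, undiluted_nM)
```

A slope near -3.32 corresponds to roughly 100% amplification efficiency; slopes far from that value usually indicate a problem with the standards or the reaction rather than with the library.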

Go to https://sequencing.roche.com/en-us/products-solutions/by-application/research/whole-genome-sequencing.html for further information about whole-genome sequencing.

About the Author

Josh P. Roberts has an M.A. in the history and philosophy of science, and he also went through the Ph.D. program in molecular, cellular, developmental biology, and genetics at the University of Minnesota, with dissertation research in ocular immunology.