Next-generation sequencing (NGS) workflows are complicated, time-consuming, and expensive. However, there are many opportunities to increase efficiency, optimize steps, and improve data quality. This article provides some tips and tricks to help avoid the pitfalls of low-quality sequencing libraries, uninformative reads, and uneven coverage in target enrichment NGS, and suggests some other tactics to help conserve sequencing resources.

Start with the end in mind: How much sequencing depth do you need?

Answering different research questions can require different amounts of sequencing coverage (that is, the average number of times each base of the reference is sequenced). While characterizing germline mutations, indels, and rearrangements may require a depth of only ~20X or even less, identifying somatic mutations or low-frequency SNPs can require at least 100X coverage, and some medical research questions require much deeper coverage.

To avoid wasting sequencing resources, it is important to achieve the desired coverage depth on the first sequencing run; otherwise, parts of the workflow must be repeated, and whether those steps are library preparation, target enrichment, or sequencing, this inevitably drives up costs. Conversely, if only ~20X coverage is needed, sequencing to a greater depth simply wastes reads.
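As a rough planning exercise, the number of reads to allocate per sample can be estimated from the target size, the desired depth, and the read length, inflated for the off-target and duplicate reads that will not contribute unique coverage. The sketch below uses illustrative values for panel size, read length, on-target rate, and duplicate rate; actual values depend on the panel, sample type, and workflow.

```python
# Rough estimate of reads needed to reach a desired mean coverage depth.
# All parameter values below are illustrative assumptions, not recommendations.

def reads_required(target_size_bp, desired_depth, read_length_bp,
                   on_target_rate=0.7, duplicate_rate=0.1):
    """Usable bases must cover the target desired_depth times;
    inflate for off-target and duplicate reads."""
    usable_fraction = on_target_rate * (1.0 - duplicate_rate)
    total_bases_needed = target_size_bp * desired_depth / usable_fraction
    return total_bases_needed / read_length_bp

# Example: a 2 Mb panel at 100X mean depth with 150 bp reads
# (each read of a pair counted separately).
reads = reads_required(target_size_bp=2_000_000, desired_depth=100,
                       read_length_bp=150)
print(f"~{reads / 1e6:.1f} million reads per sample")  # ~2.1 million reads
```

Repeating the same estimate for every sample in a pool also indicates how many samples can reasonably share a sequencing run of a given output.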

Assess the quality and quantity of input DNA with a qPCR-based method

One way of avoiding wasted time and resources is to evaluate the quality and quantity of the input DNA. qPCR-based quantification methods are the most accurate since they detect only DNA molecules that are viable templates for adapter ligation and library prep, and they also provide information about the size distribution of those molecules. Accurate assessment can improve the quality of sequencing data by increasing library yield, reducing duplication rates, improving sequence coverage, and preserving library diversity, and is especially valuable when working with damaged or degraded samples (e.g. FFPE).
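One common read-out from multi-amplicon qPCR assays is a quality ratio: the concentration measured with a longer amplicon divided by the concentration measured with a short amplicon, which drops as the template becomes more fragmented or damaged. The sketch below uses hypothetical concentrations, not values from any particular kit.

```python
# Illustrative quality (Q) ratio from a multi-amplicon qPCR assay.
# Concentrations below are hypothetical example values.

def q_ratio(conc_long_amplicon, conc_short_amplicon):
    """Ratio near 1.0 suggests largely intact, amplifiable DNA;
    lower values indicate fragmentation or damage (e.g. FFPE)."""
    return conc_long_amplicon / conc_short_amplicon

intact = q_ratio(conc_long_amplicon=9.5, conc_short_amplicon=10.0)  # ~0.95
ffpe = q_ratio(conc_long_amplicon=2.1, conc_short_amplicon=10.0)    # ~0.21
print(f"intact: {intact:.2f}, FFPE: {ffpe:.2f}")
```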

Importantly, accurate QC of input DNA enables the optimization of library prep workflows to maximize the amount of input converted into library molecules. Knowing the amount of viable DNA going into library prep is also useful for calculating the theoretical depth of coverage that can be obtained for each sample. For example, if the final library contains fewer genome equivalents than the coverage depth required, no amount of additional library amplification and/or sequencing will provide enough unique reads to reach the desired coverage depth; realizing this early on can avoid expensive re-sequencing of suboptimal samples.
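A back-of-the-envelope check makes this concrete: the mass of amplifiable input DNA sets an upper bound on the number of unique genome copies available, and the library conversion rate reduces it further. The sketch below assumes human genomic DNA (~3.3 pg per haploid genome) and an illustrative 30% conversion rate; both figures are assumptions for the example.

```python
# Does the input contain enough unique genome copies to support the
# desired unique coverage depth? The genome mass and conversion rate
# below are illustrative assumptions.

HUMAN_HAPLOID_GENOME_PG = 3.3  # approximate mass of one haploid human genome

def genome_equivalents(input_ng):
    return input_ng * 1000.0 / HUMAN_HAPLOID_GENOME_PG

def max_unique_depth(input_ng, conversion_rate=0.3):
    """Each genome copy converted into a library molecule can contribute
    at most ~1X of unique coverage at a given locus, so this is an upper bound."""
    return genome_equivalents(input_ng) * conversion_rate

print(max_unique_depth(input_ng=1.0))  # ~91X upper bound from 1 ng
print(max_unique_depth(input_ng=0.1))  # ~9X: cannot support 100X unique coverage
```

In the second case, 0.1 ng of input can support at most single-digit unique depth, so a 100X target for that sample calls for more input material rather than more sequencing.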

Use high-quality library prep reagents and well-designed probes

Efficient library preparation requires a workflow that is compatible with the type of input being used (e.g. FFPE, low-quality, low-input, cfDNA); that has a high conversion rate (the percentage of input DNA molecules successfully ligated to sequencing adapters); and that preserves the complexity of the input sample.

Other important factors to consider, especially in target enrichment workflows, are GC bias and coverage uniformity. The amount of GC bias in target-enriched sequencing data can be highly variable, resulting in extremely low coverage of some regions (requiring additional sequencing) and over-sequencing of others (resulting in wasted reads). In other cases, some target regions may be missed entirely (target dropout); when this occurs, the target-enriched libraries must be re-created as additional sequencing will not provide data for these regions. Using well-designed target enrichment probes produced by a reliable manufacturer can reduce these problems. Additionally, high-quality probes can enable shorter hybridization times, reduced workflow length, and increased sample throughput without compromising data quality.
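These effects are straightforward to monitor from a per-target coverage report. The sketch below is a minimal example that flags target dropouts and reports the fraction of targets at or above 20% of the mean depth; the depth values and thresholds are illustrative assumptions, and dedicated QC tools report richer metrics such as the fold-80 base penalty.

```python
# Quick uniformity check on per-target mean depths (e.g. parsed from a
# coverage report). Depths and thresholds are illustrative assumptions.
from statistics import mean

def uniformity_report(target_depths, dropout_depth=1, uniform_fraction=0.2):
    overall = mean(target_depths)
    dropouts = sum(1 for d in target_depths if d < dropout_depth)
    within = sum(1 for d in target_depths if d >= uniform_fraction * overall)
    return {
        "mean_depth": overall,
        "target_dropouts": dropouts,
        "pct_targets_at_0.2x_mean_or_above": 100.0 * within / len(target_depths),
    }

# Example: one GC-rich target dropped out, another is heavily over-covered.
print(uniformity_report([120, 95, 0, 410, 88, 15]))
```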

Select an appropriate multiplexing scheme

Sample multiplexing (pooling samples together) can increase the number of samples captured in a single target enrichment reaction or loaded onto a single sequencing run, saving time and reagents and reducing direct sequencing costs. In target enrichment workflows, samples can be pooled pre- or post-capture; in both cases, it is essential to ensure that the adapters used for each library are compatible with the planned multiplexing scheme.

Pre-capture multiplexing (combining samples prior to the hybridization step) requires less capture reagent and reduces handling of individual samples, lowering the risk of costly handling errors. Post-capture multiplexing (combining final libraries prior to sequencing) reduces sequencing costs while keeping options open for re-running specific samples, or for adjusting the relative concentration of libraries on the sequencer if greater or lesser depth is needed for particular samples.
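In either scheme, the pooling itself is a simple molar calculation: each library contributes the same molar amount unless a particular sample deliberately needs more depth. The sketch below assumes hypothetical library names and qPCR-measured concentrations and pools an equal 10 fmol of each (1 nM = 1 fmol/µL).

```python
# Sketch of equimolar pooling volumes from per-library qPCR concentrations.
# Library names, concentrations, and the per-library molar target are hypothetical.

def pooling_volumes(libraries_nM, per_library_fmol=10.0):
    """Volume (uL) of each library needed to contribute the same molar
    amount (fmol) to the pool; 1 nM = 1 fmol/uL."""
    return {name: per_library_fmol / conc for name, conc in libraries_nM.items()}

libraries = {"sample_A": 12.0, "sample_B": 4.5, "sample_C": 20.0}  # nM by qPCR
for name, vol in pooling_volumes(libraries).items():
    print(f"{name}: {vol:.2f} uL")
# sample_A: 0.83 uL, sample_B: 2.22 uL, sample_C: 0.50 uL
```

To bias depth toward a specific sample, its molar contribution can simply be scaled up relative to the others.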

Accurately quantify final libraries and library pools prior to loading the sequencer

Library quantification via qPCR-based methods enables accurate sample pooling and optimal clustering on the sequencing flow cell. In contrast, fluorometric or electropherogram-based methods may detect molecules that are not sequenceable (e.g. molecules lacking the P5 or P7 adapter) or may fail to count all sequencing-competent molecules (e.g. single-stranded library molecules). Thus, non-qPCR-based library quantification can lead either to under-clustering on the sequencing flow cell, wasting sequencing real estate, or to over-clustering, producing unusable reads.
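Whatever the quantification method, the pool concentration ultimately has to be expressed in molar terms and diluted to the loading concentration recommended for the instrument. The sketch below shows the conversion that mass-based methods rely on, using the standard ~660 g/mol per base pair for double-stranded DNA; the measured concentration, average fragment length, and 1.5 nM loading target are illustrative assumptions, and qPCR-based kits report molarity more directly.

```python
# Sketch: convert a measured library concentration to molarity and compute
# the dilution needed for a chosen loading concentration. The concentration,
# average fragment length, and loading target are illustrative assumptions.

def mass_to_molarity_nM(conc_ng_per_ul, avg_fragment_bp):
    """nM = (ng/uL) / (660 g/mol per bp * fragment length in bp) * 1e6."""
    return conc_ng_per_ul / (660.0 * avg_fragment_bp) * 1e6

def dilution_factor(stock_nM, loading_nM=1.5):
    return stock_nM / loading_nM

stock = mass_to_molarity_nM(conc_ng_per_ul=2.0, avg_fragment_bp=450)  # ~6.7 nM
print(f"stock: {stock:.1f} nM, dilute {dilution_factor(stock):.1f}-fold to 1.5 nM")
```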

Roche offers an extensive selection of products and resources to maximize the efficiency of NGS workflows. To learn more, visit sequencing.roche.com/en-us.html