Target enrichment (TE) during library preparation increases the signal-to-noise ratio of next-generation sequencing (NGS). Amplicon-based TE achieves this by amplifying the signal, while hybridization-based TE accomplishes its task by fishing signal out from the sea of background noise. Either way (to mix metaphors), the result is the same: more of the needles you are looking for in a much smaller haystack.

Why enrich?

The advent of NGS, with its ability to sequence potentially millions of DNA templates in parallel, has opened up a new world of genetic and genomic inquiry. Questions that were until fairly recently barely even imaginable can now be comprehensively answered. Yet while whole genome sequencing (WGS) is the gold standard for unbiased discovery anywhere in the genome, it may offer neither the precision nor the depth needed to decipher the roles of individual genes in complex diseases, nor allow for insights into rare and low-frequency genetic variants. WGS is also time-consuming, costly, and data-intensive, often requiring specialized expertise for interpretation.

Targeting and enriching for specific regions—whether they be exons, genes, or other regions of interest (ROIs)—prior to sequencing allows for a more efficient, cost-effective means of garnering the desired data, often requiring less input sample, and allowing for greater sequencing depth, increased sensitivity, and far simpler bioinformatic analysis than WGS.
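The depth argument above is simple arithmetic, and a rough sketch may make it concrete. In the Python below, the genome size, the 1.5% exome fraction, the 500 kb panel, the 30 Gb run output, and the on-target fractions are all illustrative assumptions rather than measured values, and `mean_depth` is a hypothetical helper name.

```python
# Illustrative depth arithmetic. Every number here is an assumption, not a measurement.

GENOME_BP = 3_100_000_000            # approximate human genome size (assumed)
EXOME_BP = GENOME_BP * 15 // 1000    # roughly 1.5% of the genome (assumed coding fraction)
PANEL_BP = 500_000                   # hypothetical 500 kb targeted panel


def mean_depth(sequenced_bases: float, target_bp: int, on_target_fraction: float = 1.0) -> float:
    """Mean coverage over a target: usable on-target bases divided by target size."""
    if target_bp <= 0:
        raise ValueError("target_bp must be positive")
    if not 0.0 <= on_target_fraction <= 1.0:
        raise ValueError("on_target_fraction must be between 0 and 1")
    return sequenced_bases * on_target_fraction / target_bp


if __name__ == "__main__":
    budget = 30e9  # hypothetical run output: 30 Gb of sequence
    scenarios = [
        ("WGS", GENOME_BP, 1.0),     # no enrichment, so every base counts as on target
        ("exome", EXOME_BP, 0.7),    # assumed on-target fraction
        ("panel", PANEL_BP, 0.7),    # assumed on-target fraction
    ]
    for name, size, on_target in scenarios:
        print(f"{name:>6}: ~{mean_depth(budget, size, on_target):,.0f}x mean depth")
```

With the same sequencing budget, shrinking the target is what buys depth; the on-target fraction only scales the result. Real coverage is never perfectly uniform, so the mean is an upper-level guide rather than a guarantee for any single position.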

There are two principal methods commonly used to enrich for targets: amplicon-based and hybridization-based. Each has its own distinct advantages and disadvantages.

Amplicon-based TE

Amplicon-based TE is essentially a type of highly multiplexed PCR of fragmented genomic DNA during which NGS indexes and adapters are added in one of the amplification steps. Sequence-specific primers are designed to flank ROIs, allowing those regions to be selectively amplified. Library preparation and target enrichment can be performed simultaneously, or the amplicons resulting from the initial PCR can be pooled and used to create a target-enriched library for sequencing.

The method is relatively quick, simple, and low cost, and requires only very small amounts of starting material. Because of the specificity of the PCR primer design, carefully designed amplicon-based TE can result in a high on-target sequencing read percentage.
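To pin down what an on-target read percentage means, here is a minimal sketch that counts reads overlapping a set of target intervals. It assumes reads and targets are half-open `(start, end)` coordinates on a single chromosome and treats any one-base overlap as on target; real pipelines may count bases rather than reads, or pad the targets. The function names are hypothetical.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Sort half-open (start, end) intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def on_target_fraction(reads, targets):
    """Fraction of reads that overlap any target by at least one base."""
    if not reads:
        return 0.0
    merged = merge_intervals(targets)
    starts = [s for s, _ in merged]
    hits = 0
    for r_start, r_end in reads:
        # Index of the last merged target that begins before the read ends.
        i = bisect_right(starts, r_end - 1) - 1
        # Merged targets are disjoint, so only this one can overlap the read.
        if i >= 0 and merged[i][1] > r_start:
            hits += 1
    return hits / len(reads)


if __name__ == "__main__":
    targets = [(100, 200), (150, 300), (1000, 1100)]   # made-up ROIs
    reads = [(90, 110), (300, 350), (1099, 1150), (500, 600)]  # made-up reads
    print(f"on-target: {on_target_fraction(reads, targets):.0%}")
```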

Hybridization-based TE

Hybridization-based (or capture-based) TE utilizes single-stranded DNA or RNA oligonucleotides (typically free in solution) as baits to capture complementary ROIs. The double-stranded complexes are then immobilized and washed to remove unbound, nonspecific molecules. Finally, the bound DNA, which represents only the targeted ROIs, is released from the baits.

Some protocols first create a sequencing library from randomly sheared denatured genomic DNA prior to capturing the target library molecules. Others begin by first capturing the genomic DNA and then creating a library from the captured sequences.

Among the advantages of hybridization-based TE is that it is highly sensitive, so it’s less likely to miss mutations, which is particularly beneficial for variant calling. It performs well with respect to sequencing complexity, and offers good uniformity of coverage. It can cover a wide expanse of genomic regions in a single experiment. And it can be used to capture a nearly unlimited number of targets simply by adding more probes.

Compare and contrast: What to consider

The popularity of whole exome sequencing (WES)—targeting the roughly 1.5% of the genome that codes for proteins, thus vastly reducing the amount of genetic material to be sequenced—is a testament to the power of target enrichment. But because many sequences remain that may be irrelevant to the question at hand, oftentimes even WES leads to far more sequencing, and (given a finite amount of resources) consequently far shallower sequencing, than is needed. While both amplicon-based and hybridization-based TE will significantly reduce the sequencing burden and allow resources to be concentrated on ROIs, there are several factors to consider.

How much time and energy do you plan to invest? What about budget? How much sample is available to use? Compared to amplicon-based enrichment protocols, hybridization-based approaches require more sample, more time, and more steps, and may also be more expensive per sample to run.

Yet applications like genotyping and detection of rare variants necessitate greater sensitivity. Hybridization-captured libraries can detect mutations present at levels down to about 1%, compared with about 5% for amplicon-based TE libraries. When the probes are well-designed and of high quality, hybridization libraries are also more uniform, whereas amplicon libraries suffer more from PCR bias.
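Why low-frequency variants are demanding can be illustrated with a simple binomial model. The sketch below is not a model of either enrichment chemistry and does not derive the 1% and 5% figures above; it assumes error-free reads sampled independently, and the five-read calling threshold is a hypothetical choice. It only shows how the chance of seeing enough variant-supporting reads depends on depth and allele fraction.

```python
from math import comb


def detection_probability(depth: int, allele_fraction: float, min_variant_reads: int) -> float:
    """P(at least min_variant_reads variant reads) under Binomial(depth, allele_fraction)."""
    if min_variant_reads <= 0:
        return 1.0
    # Probability of seeing fewer variant reads than the threshold requires.
    miss = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min(min_variant_reads, depth + 1))
    )
    return max(0.0, 1.0 - miss)


if __name__ == "__main__":
    threshold = 5  # hypothetical minimum number of supporting reads to call a variant
    for depth in (100, 500, 1000, 2000):
        p5 = detection_probability(depth, 0.05, threshold)
        p1 = detection_probability(depth, 0.01, threshold)
        print(f"depth {depth:>4}x: P(detect 5% variant) = {p5:.3f}, P(detect 1% variant) = {p1:.3f}")
```

Under these assumptions, a variant at 1% needs considerably more depth than one at 5% to be seen reliably, which is one reason uniform, deep coverage matters for rare-variant work.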

Sensitivity comes at the cost of specificity, though. Amplicon-based enrichment is able to deliver a higher percentage of on-target reads.

There is also a limit to how many targets can be multiplexed in an amplicon-based enrichment. That number is scalable and virtually unlimited for a hybridization-based protocol, making it well suited for larger probe panels.

About the Author

Josh P. Roberts has an M.A. in the history and philosophy of science, and he also went through the Ph.D. program in molecular, cellular, developmental biology, and genetics at the University of Minnesota, with dissertation research in ocular immunology.