Despite the plummeting costs of next-generation sequencing (NGS), there remain many instances in which target enrichment before sequencing is a very good idea. Besides the obvious efficiency and cost-saving on sequencing and subsequent analysis, focusing on just the areas of interest allows for far more depth from far fewer reads. It affords the sensitivity to pick up low-abundance material, such as minimal residual disease from liquid biopsy, the SARS-CoV-2 viral genome, or low-frequency variants involved in cancers and polygenic diseases.

Amplicon-based and hybridization-based workflows are the two principal methods researchers use to enrich targets prior to NGS. Here we look at both, with expert advice on which to choose when, as well as tips for avoiding pitfalls and getting the most out of the endeavor.

Amplification and hybridization head-to-head

The amplicon-based approach is basically PCR amplification of certain genetic regions, resulting in large numbers of copies of those regions in relation to the non-amplified rest of the genome. It’s quick and easy, and works with relatively small amounts of DNA. It “requires some prior knowledge about the structure and sequence of the genome and the region of interest, and is really only suitable for a small number of regions of interest”—a limited number of biomarkers—says Jon Badalamenti, a researcher at the University of Minnesota Genomic Center.

And “because it amplifies the DNA between two sequences dictated by the specific primers designed to demarcate at each end of the amplicon, it can fail to amplify if variation under the primer sequences renders them ineffective,” as has been seen with variants of SARS-CoV-2, points out Steven Henck, Vice President, R&D, of Integrated DNA Technologies (IDT). “However, IDT’s novel amplicon chemistry enables the use of overlapping primers to mitigate these problems.”

With a hybridization approach, complementary nucleic acid probes are used to pull down regions of interest. Here the biotin-labeled probes are mixed with genomic DNA, heated for denaturation, and allowed to cool and hybridize, explains Siyuan Chen, Chief Technology Officer of Twist Bioscience. The probe-genomic DNA hybrids are then captured on Streptavidin-coated magnetic beads.

Choosing between Amplicon-based and Hybridization-based Workflows

  • Pros of Amplicon-based: Quick and easy. Needs less sample
  • Cons of Amplicon-based: Primers need to play well together
  • Pros of Hybridization-based: More tolerant of underlying variation. Nearly unlimited ROIs
  • Cons of Hybridization-based: Long, complicated workflow

In addition to being able to target a virtually unlimited number of regions, a “hybridization-based approach can tolerate mismatches better than an amplicon-based approach, which is particularly important when targeting highly variable regions such as viral genomes,” Chen adds. Its principal downside is its long, demanding, labor-intensive workflow.

Should I enrich?

But is target enrichment even always called for? “We’ve certainly had times where we price out the cost of hybrid capture on six samples, and it’s actually cheaper to sequence the whole genome on the six samples,” points out Badalamenti.

This is due, in part, to the economics of targeting and sequencing. Over and above the investments of time, money, and/or labor necessary to enrich a sample for sequencing, it may be necessary to order a minimum number of targeting reactions.

Another consideration is how small a region of the genome is being targeted. Given the cost of reagents, it doesn’t make sense to target 10% of the genome, Badalamenti says. But if you’re looking at 50 kb out of the human genome, “you probably do want to target it.”
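The arithmetic behind this trade-off can be sketched in a few lines of Python. This is a rough illustration rather than a vendor calculator: it estimates how many reads are needed to reach a given mean depth over a region. The function name reads_needed, the 150 bp read length, the 30x depth, and the 50% on-target fraction are assumptions chosen for illustration; only the 50 kb region comes from the discussion above, and the genome size is rounded to about 3.1 Gb.

```python
import math

# Approximate size of the haploid human genome in base pairs (rounded).
HUMAN_GENOME_BP = 3_100_000_000


def reads_needed(region_bp, mean_depth, read_length_bp=150, on_target_fraction=1.0):
    """Estimate reads required to cover region_bp at mean_depth.

    on_target_fraction is the share of reads that land in the region;
    it is 1.0 for whole-genome sequencing and lower after enrichment,
    because some off-target reads always come along.
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    usable_bases_per_read = read_length_bp * on_target_fraction
    return math.ceil(region_bp * mean_depth / usable_bases_per_read)


# Whole genome at 30x versus a 50 kb target at 30x with an assumed
# 50% on-target rate (hypothetical value for illustration).
whole_genome_reads = reads_needed(HUMAN_GENOME_BP, 30)
panel_reads = reads_needed(50_000, 30, on_target_fraction=0.5)

print(f"Whole genome: {whole_genome_reads:,} reads")
print(f"50 kb panel:  {panel_reads:,} reads")
```

Under these assumptions the small panel needs several orders of magnitude fewer reads than the whole genome, which is the saving that has to be weighed against the cost of enrichment reagents and labor.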

Fixed and custom panels

Once the decision is made to enrich samples prior to sequencing, the next step is to see whether an off-the-shelf (fixed) panel will meet your needs. For enough regions of interest, it may pay to use an exome panel to capture the entire coding region (about 1% of the genome), which “could be more cost effective if you take away all the time and effort, energy, that is typically required to design and optimize a large panel,” says Zach Herbert, Director of the Molecular Biology Core Facilities at Dana-Farber Cancer Institute.

Many companies also offer pre-designed and validated panels for specific pathways or disease states. Often, though, such panels don’t include the one or two pet genes that matter to a researcher’s project. “Which is why some companies will allow you to do a spike-in,” Herbert notes. “You can design some custom probes that you spike into a standard panel. That’s a nice option.”

Sometimes a fixed panel just isn’t an option. But designing a custom panel can be tough, with many pitfalls to be aware of. For example, “particularly for amplicon-based approaches, you have to be a little more thoughtful about primers, including making sure they play well together, have limited overlap, and do not bind to regions of high variability,” explains Henck.

Chen warns of “context-specific biases” caused by extreme GC content, secondary structure, or repetitive sequences, for example, leading to non-uniform sequencing coverage. “One of the ways that Twist overcomes challenges in target enrichment is ensuring that each probe is correctly ‘tuned’ to capture the desired sequences within the sample in a uniform way,” for example by smartly placing probes and by adjusting the relative probe concentration to ensure high capture efficiency.

If it’s time to design a custom panel, Badalamenti recommends not “white-knuckling it,” but instead taking advantage of vendors’ ability to help. “They have software and trained scientists whom they pay to review those panels to make sure there are no potential pitfalls.”

Pooling

Many more recent hybrid capture approaches allow pooling of up to 12 barcoded sample libraries prior to enrichment. “That’s great, and it saves a lot of money, because capture reagents are expensive,” Herbert remarks. “But one of the challenges there is, if you don’t do a really good job pooling them before the capture, then you get unbalanced coverage of those libraries after.”

To address the challenge, Herbert is very careful and thorough with library QC: quantitating each library and looking at its size distribution. “If everything goes well, libraries have a relatively narrow size distribution, which tend to be easier to pool.” To get that narrow size distribution, he recommends shearing the DNA before making the library.

Herbert also recommends using a “healthy amount”—100–200 ng—of DNA when making a library for exome sequencing. Otherwise, there may not be good representation going into the capture, and “as a result you’ll end up with a potentially unbalanced pool that you can’t do anything about.”
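The two QC measurements Herbert describes, concentration and fragment size, are what an equimolar pooling calculation needs. The Python sketch below is a minimal illustration, not a protocol from any of the sources quoted here. It converts a mass concentration to molarity using the common approximation of 660 g/mol per base pair of double-stranded DNA, then works out how much of each library to add. The function names, library names, and example values are hypothetical.

```python
# Approximate molar mass of one base pair of double-stranded DNA (g/mol).
GRAMS_PER_MOL_PER_BP = 660


def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert ng/uL and mean fragment length (bp) to nanomolar."""
    return conc_ng_per_ul * 1e6 / (GRAMS_PER_MOL_PER_BP * mean_fragment_bp)


def equimolar_volumes_ul(libraries, fmol_per_library):
    """Return the volume (uL) of each library that contains fmol_per_library.

    libraries maps a library name to (ng/uL, mean fragment bp).
    1 nM equals 1 fmol/uL, so volume = fmol / nM.
    """
    volumes = {}
    for name, (conc_ng_per_ul, mean_fragment_bp) in libraries.items():
        molarity_nm = library_molarity_nm(conc_ng_per_ul, mean_fragment_bp)
        volumes[name] = fmol_per_library / molarity_nm
    return volumes


# Hypothetical QC results for two barcoded libraries.
example_libraries = {
    "lib_A": (33.0, 500),  # 33 ng/uL, 500 bp mean fragment -> about 100 nM
    "lib_B": (13.2, 400),  # 13.2 ng/uL, 400 bp mean fragment -> about 50 nM
}

for name, volume in equimolar_volumes_ul(example_libraries, 200).items():
    print(f"{name}: {volume:.2f} uL")
```

Because molarity depends on fragment length as well as mass, two libraries with the same ng/µL reading can contain different numbers of molecules. That is one reason a narrow, consistent size distribution makes pooling easier.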

Other ways to enrich

Badalamenti adds a third workflow “which is in a kind of beta phase, and is more vendor-specific: using CRISPR/Cas9 to do targeted enrichment using a guide RNA.” Here, unwanted sequences are depleted by site-specific cleavage, resulting in an increase in comparatively rare species.

He also mentions that “you can do kind of an in silico enrichment” using a new technique called “adaptive sampling” on the Oxford Nanopore electro-chemical sequencing platform. “You tell the instrument what regions you don’t want to sample, and have the instrument reject DNA from those regions on the fly.”
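The accept-or-reject decision at the heart of this idea can be sketched as a simple interval lookup. The Python snippet below is a toy illustration only. It is not Oxford Nanopore's software or API, and it assumes that the first stretch of each read has already been mapped to a chromosome and position by some other tool. The function name, region table, and coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

# Half-open genomic interval: start is included, end is excluded.
Interval = Tuple[int, int]


def should_reject(chrom: str, position: int,
                  reject_regions: Dict[str, List[Interval]]) -> bool:
    """Return True if a read mapped at chrom:position falls in an unwanted region."""
    return any(start <= position < end
               for start, end in reject_regions.get(chrom, []))


# Hypothetical regions the user does not want to sequence.
unwanted = {"chr1": [(0, 1_000), (5_000, 6_000)]}

# Hypothetical early mapping results for three reads.
for chrom, pos in [("chr1", 500), ("chr1", 2_000), ("chr2", 500)]:
    action = "reject" if should_reject(chrom, pos, unwanted) else "keep sequencing"
    print(f"{chrom}:{pos} -> {action}")
```

A real implementation would have to make this decision within a fraction of a second per read and would use an indexed structure rather than a linear scan, but the logic is the same: look at where the molecule starts, then decide whether it is worth finishing.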

From traditional PCR to cutting-edge workflows, target enrichment can allow researchers to better focus sequencing efforts and dollars on their regions of interest.