Sequence Capture

Editorial Article

Wednesday July 14, 2010

by Jeffrey M. Perkel

Ten years after the first draft human genome was announced, genomic science is finally beginning to pay dividends in the healthcare arena.

A few examples:

In November 2009, Richard Lifton of Yale University used "next-gen" DNA sequencing to correctly diagnose congenital chloride diarrhea in a patient who had originally been suspected of having a renal condition such as Bartter syndrome.2

Others have used the technology to pin down gene loci associated with such diseases as Miller syndrome, an autosomal recessive disorder,1 Schinzel-Giedion syndrome, an autosomal dominant disorder,3 and TARP, an X-linked form of cleft palate.4

And just last month, Mary-Claire King and colleagues at the University of Washington reported that sequencing technology provided a rapid, cost-effective, and thorough way to screen women for breast and ovarian cancer-associated mutations, a proof-of-principle demonstration the authors say could decrease the cost of genetic testing for these risk factors to about one-sixth of current pricing.5

Yet not one of these studies involved whole-genome sequencing. Rather than sifting through gigabase upon gigabase of hard-to-interpret non-coding sequence, these researchers chose to focus the sequencers' power on specific segments of the genome, filtering out extraneous information.

That strategy is called sequence enrichment, and in three of the five studies, researchers used one specific form called exome capture, which selects for the roughly 1% of the human genome, about 30 Mb, that encodes protein. The rationale for homing in on those 180,000 or so exons is simple: That's where disease-associated mutations are most likely to be found.

"We anticipate that whole-exome sequencing will make broad contributions to understanding the genes and pathways that contribute to rare and common human diseases, as well as clinical practice," Lifton wrote in his study.2

The two other studies employed narrower targeting strategies, in one case limiting their analysis to X chromosome exons,4 and in the other to 21 specific cancer-associated genes, including both coding and noncoding sequences.5

According to Fred Ernani, SureSelect platform business manager at Agilent Technologies, targeting strategies simply make good financial and practical sense. Most researchers, he notes, are neither interested in the entire genome nor able to interpret the non-coding sequence variants they find. Plus, sequencing is still expensive, and for all the data they produce, next-gen sequencing platforms don't have limitless capacity; the more samples a researcher can squeeze into that finite space, the better.

"My argument," Ernani says, "is that target enrichment will still be necessary probably over the next five years or more, simply because the whole genome is still three billion bases, and no matter how efficient you make a sequencer, if you can take advantage of a multiplexing product, it's still going to be cheaper to run the sequencing with a target enrichment product … than doing a whole genome."

It's basic math. To sequence both strands of a diploid human genome to 20x coverage requires at least 120 billion bases, compared to "just" 600 million bases to achieve equivalent depth in an exome. In practice, more sequencing is required in either case to overcome enrichment or sequencing biases. Still, with sequencers routinely churning out tens of gigabases of data, researchers can easily fit multiple samples into a single run (an approach called indexing or barcoding).
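The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function names are made up for this example, the 30-gigabase run size is a hypothetical stand-in for "tens of gigabases," and the extra sequencing needed to overcome bias is ignored. Note that the genome figure quoted above includes a factor of two, while the 600-million-base exome figure corresponds to 30 Mb at 20x without it, so the sketch exposes that factor as a parameter.

```python
def bases_required(target_bp: int, depth: int, copies: int = 1) -> int:
    """Raw bases needed to cover a target of `target_bp` bases at `depth`,
    multiplied by `copies` (2 reproduces the article's genome figure)."""
    return target_bp * depth * copies


def samples_per_run(run_output_bp: int, per_sample_bp: int) -> int:
    """Whole number of samples that fit into one run, ignoring bias and overhead."""
    return run_output_bp // per_sample_bp


GENOME_BP = 3_000_000_000       # about three billion bases
EXOME_BP = 30_000_000           # about 1% of the genome
DEPTH = 20                      # 20x coverage
RUN_OUTPUT_BP = 30_000_000_000  # hypothetical run size, an assumption for illustration

genome_bases = bases_required(GENOME_BP, DEPTH, copies=2)
exome_bases = bases_required(EXOME_BP, DEPTH)
exome_bases_doubled = bases_required(EXOME_BP, DEPTH, copies=2)
exomes_per_run = samples_per_run(RUN_OUTPUT_BP, exome_bases)

print(f"Whole genome at 20x: {genome_bases:,} bases")
print(f"Exome at 20x:        {exome_bases:,} bases")
print(f"Exome at 20x, with the same factor of two: {exome_bases_doubled:,} bases")
print(f"Exomes per hypothetical 30-gigabase run: {exomes_per_run}")
```

Under these assumptions a single run has room for dozens of barcoded exomes but not even one whole genome, which is the economic case Ernani describes.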

Agilent's SureSelect, which figured prominently in several of the studies mentioned above, uses hybridization to capture targeted sequences and is available in both solid-phase (microarray) and solution formats. The former, SureSelect DNA Capture Arrays, are custom 244,000-element, 60-mer DNA microarrays built to customer specifications using the company's eArray software; a one-million-feature array has also been rolled out to early-access customers, says Ernani. Either size, he says, will work for customers processing relatively small numbers of samples (up to about 10).

For larger scale work, Ernani recommends solution-based capture, as it is "much more scalable." Arrays simply aren't amenable to automation, he notes, whereas solution-based approaches, which can be processed in 96-well plates and use magnetic capture beads to pull down targeted sequences, are.

Built from 120-mer biotinylated RNA capture probes, the SureSelect Target Enrichment System is available in both user-designed and off-the-shelf configurations. Custom content, developed using eArray, can target from under 200 kb to around 7 Mb at present, according to Ernani, and larger regions "in the near future." An off-the-shelf exome capture product (the SureSelect Human All Exon Kit) targets 38 Mb of genomic DNA.

Roche NimbleGen and LC Sciences also offer hybridization-based capture tools. LC Sciences launched its custom array-based sample enrichment service early in 2010, according to Chief Technical Officer Xiaochun Zhou. And Roche NimbleGen has been offering array-based capture tools, in both 385,000 and 2.1-million probe formats, for more than two years, according to Roche Applied Science Marketing Manager Mark Repko, including the exome-targeting Sequence Capture 2.1M Human Exome Array. But, like Agilent, both companies now are moving towards a more fluid approach.

Arrays, explains Repko, are ideal for relatively small experiments, and for optimizing probe content via iterative experimentation and redesign. Solution-based technologies, though, benefit from "economies of scale," he says, especially if that scale extends to hundreds, or even thousands, of samples.

"Customers looking to do large projects like the idea of automation, and the economies of scale you get from the manufacturing, allow us to have a better price point in the market," he says.

At LC Sciences, customers have expressed more interest in the firm's custom OligoMix technology, which lets them design and purchase pools of up to 30,000 solution probes, than in its array-based service, says Zhou.

Roche NimbleGen recently released its SeqCap EZ Human Exome Library, employing biotinylated DNA oligonucleotides ranging from 55 to 105 nucleotides in length, for both short-read (that is, Illumina's Genome Analyzer) and long-read (Roche's Genome Sequencer FLX) sequencing technologies, and is preparing to launch an updated version with expanded exon target coverage shortly, Repko says.

"We really feel that solution-based capture is the future of sequence capture," concludes Zhou.

But there are other approaches to target enrichment, especially PCR.

Hybridization-based approaches, says Roopom Banerjee, president and CEO of RainDance Technologies, suffer from a number of shortcomings, including relatively low capture efficiency ("historically 60 to 70%"), the inability to target repeated sequences, and a loss of heterozygosity (that is, unequal capture of both chromosomal copies). PCR, by contrast, has nearly 100% sensitivity, he says. ("If we fall below 99.5%, we're having a bad day," Banerjee quips.)

RainDance Technologies' Sequence Enrichment Solution enables the simultaneous PCR-based capture of several thousand discrete amplicons by generating pools of millions of picoliter-sized reaction vessels, each of which contains template, enzyme, amplification reagents, and one of up to 20,000 different primer pairs.

That's too small to capture a complete exome, but according to Banerjee, the exome isn't RainDance's target anyway. Instead, RainDance directs its microdroplet technology at more discrete studies, such as validating potential biomarkers. "What do you do after a GWAS [genome-wide association] study?" he asks. "You need targeted resequencing."

Here's how it works: The user selects the primer pairs of interest and sends those sequences to RainDance. The company then synthesizes those oligos, encapsulates them in individual picoliter droplets, and returns the resulting pool to the customer. The customer then loads that pool, along with genomic DNA template and amplification reagents, into the company's RDT 1000 system, which merges the materials to create the final, amplification-ready mixture. That mixture is then amplified in a standard thermocycler.

According to Banerjee, the company will soon expand its offerings with kits for "ultra-deep sequencing" (to capture sequence alleles with frequencies as low as 1%) and to sequence methylated DNA.

Microfluidics firm Fluidigm also supports PCR-based enrichment. The company's Access Array™ "integrated fluidic circuit" automates the amplification of up to 48 discrete amplicons in each of 48 samples, for a total of 2,304 reactions.

Whatever the particular targeting strategy, given how quickly sequencing throughput is rising and prices are falling, many are asking whether sequence enrichment even has a future. The short answer, says Andy Watson, vice president of SOLiD Systems Product Management at Life Technologies, is yes.

While sequence enrichment does add steps and cost to the sequencing process, not to mention the possibility of bias, some applications simply require targeted approaches: identifying rare sequence variants in tumors, for instance.

"The high depth of coverage required to detect mutations present in less than 1% of the cells in a sample currently makes whole-genome sequencing impractical," he says.

If nothing else, he concludes, "Somatic mutation detection will be the driver of the use of targeted sequencing."
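Watson's point about depth can be illustrated with a simple binomial model. The Python sketch below is an idealized illustration, not a variant-calling method: it treats the 1% figure as the fraction of reads carrying the mutation, ignores sequencing error and capture bias, and uses a hypothetical threshold of five supporting reads. The function name and the depths tried are likewise assumptions made for this example.

```python
from math import comb


def prob_at_least(min_reads: int, depth: int, variant_fraction: float) -> float:
    """Probability of seeing at least `min_reads` variant-carrying reads at a
    site covered `depth` times, if each read independently carries the variant
    with probability `variant_fraction` (simple binomial model)."""
    p_fewer = sum(
        comb(depth, i) * variant_fraction**i * (1 - variant_fraction) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


VARIANT_FRACTION = 0.01  # variant present in 1% of reads (simplifying assumption)
MIN_READS = 5            # hypothetical evidence threshold

for depth in (20, 100, 1000):
    p = prob_at_least(MIN_READS, depth, VARIANT_FRACTION)
    print(f"depth {depth:>4}x: P(at least {MIN_READS} variant reads) = {p:.4f}")
```

In this model the chance of collecting enough supporting reads is negligible at 20x and becomes high only at depths in the hundreds to thousands, which is far easier to reach across a small target than across three billion bases.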

References

1 SB Ng, et al., "Exome sequencing identifies the cause of a mendelian disorder," Nat Genet, 42:30-5, 2010.

2 M Choi, et al., "Genetic diagnosis by whole exome capture and massively parallel DNA sequencing," PNAS, 106[45]:19096-19101, 2009.

3 A Hoischen, et al., "De novo mutations of SETBP1 cause Schinzel-Giedion syndrome," Nat Genet, 42[6]:483-5, 2010.

4 J Johnston, et al., "Massively parallel sequencing of exons on the X chromosome identifies RBM10 as the gene that causes a syndromic form of cleft palate," Am J Hum Genet, 86:743-8, 2010.

5 T Walsh, et al., "Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing," PNAS, published online before print 28 June 2010, doi: 10.1073/pnas.1007983107.

Article image is from Agilent's SureSelect DNA Capture Assay product literature.
