Next Generation Sequencing

by Jeffrey M. Perkel

If you want to get a sense of the current state of the high-throughput sequencing market, look no further than this month's news.

First, the US Department of Energy's Joint Genome Institute mothballed the last of its fleet of Sanger chemistry-based sequencers, completing the transition to newer, faster, next-gen sequencers that has been in the works for several years.

"With these new sequencers incorporated into the production line over the last two years, our productivity has risen to 1 terabase in FY09; 5 Tb in FY10 and to a projected over 25 Tb in FY11," GenomeWeb quotes JGI Spokesman David Gilbert as saying. [1] "To put this in perspective, our total commitment to DOE in FY98 was 20 megabases, which we do now in a few minutes."
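Gilbert's comparison is easy to check with a little arithmetic. The sketch below is a back-of-the-envelope calculation, not anything JGI published: it assumes each year's output is spread evenly over a full year of round-the-clock operation, and the function and variable names are made up for illustration.

```python
# Rough check of the JGI throughput figures quoted above.
# Assumption: output is spread evenly over a full year of nonstop operation.
MINUTES_PER_YEAR = 365 * 24 * 60

# Annual output in bases, as quoted (1 Tb = 1e12 bases).
ANNUAL_OUTPUT_BASES = {
    "FY09": 1e12,
    "FY10": 5e12,
    "FY11 (projected)": 25e12,
}
FY98_COMMITMENT_BASES = 20e6  # 20 megabases


def minutes_to_sequence(target_bases, bases_per_year):
    """Minutes needed to produce target_bases at a steady annual rate."""
    bases_per_minute = bases_per_year / MINUTES_PER_YEAR
    return target_bases / bases_per_minute


for label, output in ANNUAL_OUTPUT_BASES.items():
    minutes = minutes_to_sequence(FY98_COMMITMENT_BASES, output)
    print(f"{label}: about {minutes:.1f} minutes for 20 Mb")
```

Under that even-rate assumption, the FY10 figure works out to roughly two minutes and the projected FY11 figure to under a minute, in line with Gilbert's "a few minutes."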

The second item was the initial data release from the 1000 Genomes Project Consortium, an effort to sequence the genomes of 1,000 humans and thereby get a handle on human sequence variation. In a report in the journal Nature, the Consortium detailed the sequencing and analysis of nearly 900 individual genomes (179 full genomes and 697 partial exomes), as part of the project's "pilot phase," using a blend of next-gen sequencers from Illumina, 454 (a Roche company), and Life Technologies. [2]

Remarkable as that achievement is, it represents just a fraction of next-gen sequencing output to date. According to an infographic accompanying the article, "at least 2,700 human genomes will have been completed by the end of this month [October 2010], and [the] total will rise to more than 30,000 by the end of 2011." [3]

The final news item: One of those 2,700 genomes belongs to none other than rocker Ozzy Osbourne, of MTV's The Osbournes and biting-the-head-off-a-live-bat fame, who wrote of the experience in the October 24 Sunday Times of London. According to Scientific American [4]:

"I was curious," he wrote in his column. "Given the swimming pools of booze I've guzzled over the years—not to mention all of the cocaine, morphine, sleeping pills, cough syrup, LSD, Rohypnol…you name it—there's really no plausible medical reason why I should still be alive. Maybe my DNA could say why."

If the JGI announcement and the 1000 Genomes Project data release speak to the fact that sequencing whole genomes is, as Jay Therrien, vice president of commercial operations for next-gen sequencing at Life Technologies, puts it, "basically routine," the Osbourne sequencing project attests to how far there still is to go.

"We're in this era at the moment of celebrity genomics," says Daniel MacArthur, a UK-based postdoctoral fellow who blogs [5] and tweets [6] extensively about the next-gen sequencing industry. "That will persist for a while until the cost goes down enough that ordinary people can actually afford to do it. And I guess that's when it will get really interesting."

Of course, from a technology point of view, the next-gen sequencing arena has been interesting for years. Harvard geneticist George Church estimates that industry costs have plummeted about 10-fold per year for each of the past five years. Even if that pace is slowing a bit (Church estimates this year's improvement at between three- and fivefold), the field continues to be interesting.

To wit: the rise of "personal" next-gen platforms. All three of the major sequencing companies, Life Technologies, Roche/454, and Illumina, have announced such devices, which provide a lower-cost, lower-throughput alternative for those researchers who would like to take advantage of next-gen sequencing, but have neither the resources nor the need for the industrial-scale equipment that previously was their only option.

"To keep the latest generation of Illumina fully loaded, you need to have a 400-gigabase-pair project. Most people don't have a 400-Gbp project," says Church, who is a scientific advisor for some 18 next-gen firms, including all six with commercial products (Dover Systems, Roche/454, Life Technologies, Illumina, Complete Genomics, and Helicos).

First out of the gate was Roche/454, which announced its GS Junior system late in 2009. Priced at around $100,000 (as compared to $500,000 for the company's top-of-the-line GS FLX), the GS Junior runs the same pyrosequencing chemistry as the GS FLX, but at a lower throughput: 100,000 parallel reactions, compared to one million on the GS FLX.

"It's a scaled down version of our big system," says Katie Montgomery, marketing communications manager at 454 Life Sciences.

At about 400 bases apiece, the system's reads currently lead the industry in terms of length. But in 2011, the company plans an upgrade to about a kilobase, says Montgomery, adding that this will be available to existing users as "a small hardware upgrade to accommodate the increased reagent volumes."

On Oct. 26, Illumina announced a new member of its sequencer line, as well. The HiSeq 1000 "is designed for researchers who want the ease of use, industry-leading cost per gigabase (Gb) and data rate of the HiSeq 2000 but do not currently require its throughput," the company said in a press release (Illumina was unavailable to comment for this article). [7]

This "single flow cell version" of the HiSeq 2000 "will deliver in excess of 100 Gb of data per run using paired 100 base pair reads, easily enabling the sequencing of a complete human genome in a single run," according to the release.

Finally, at the American Society of Human Genetics annual meeting this week, Life Technologies announced two new additions to its line of SOLiD sequencers. On the high end, the company is launching the SOLiD 5500xl. Built in collaboration with Hitachi, the 5500xl will generate twice the data of SOLiD 4 (200 Gbp per run) at half the cost ($6,000 per run) and in half the time (5-6 days vs 10-12), Therrien says.

"You can sequence an entire human genome at 30x coverage at a price of $3,000, which was unheard of just a year ago," says Therrien.
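Therrien's $3,000 figure can be roughly reproduced from the run numbers he cites. The sketch below is an illustration only: it assumes a human genome of about 3.2 billion base pairs (a commonly used approximation, not a figure from Life Technologies), counts only the quoted per-run cost, and ignores instrument price, labor, and any reads lost to filtering. The function name is hypothetical.

```python
# Assumption: approximate haploid human genome size in base pairs.
HUMAN_GENOME_BP = 3.2e9


def cost_per_genome(run_yield_bp, run_cost_usd, coverage,
                    genome_bp=HUMAN_GENOME_BP):
    """Share of one run's cost needed to reach the given coverage.

    coverage = total sequenced bases / genome size, so the bases needed
    are coverage * genome_bp, and the cost scales with the fraction of
    the run those bases consume.
    """
    bases_needed = coverage * genome_bp
    fraction_of_run = bases_needed / run_yield_bp
    return fraction_of_run * run_cost_usd


# Figures quoted for the SOLiD 5500xl: 200 Gbp and $6,000 per run.
estimate = cost_per_genome(run_yield_bp=200e9, run_cost_usd=6000, coverage=30)
print(f"Estimated cost of a 30x human genome: ${estimate:,.0f}")
```

With these inputs, a 30x genome uses a little under half of a run, so the estimate lands just below $3,000.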

At the same time, the company also announced a personal option. Priced at $299,000 (vs $595,000 for the 5500xl), the SOLiD 5500 base system is essentially a single flow-cell version of the 5500xl.

Life Technologies is also gearing up to commercialize an entirely new sequencing technology this November. Based on its recent acquisition of Ion Torrent Systems for some $725 million, the Personal Genome Machine (PGM) will provide up to 10 megabases' worth of 100-base reads in just two hours for about $500, according to Therrien. (An upgrade to 100 Mbp per run is planned for release "early next year," he adds.)

That, Church notes, is 1,600 times more expensive per-base than the SOLiD 4. But the company, says Therrien, is positioning it as mid-way between a Sanger capillary electrophoresis-based instrument and the SOLiD, for applications such as bacterial and viral genomics and targeted amplicon sequencing. "What that gets you is a radical reduction in turnaround time for really what is a very large amount of sequence data," he says.
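To see where a gap of that size comes from, compare cost per megabase. The sketch below uses only per-run figures quoted in this article: about $500 for 10 megabases on the PGM, and $6,000 for 200 Gbp on the SOLiD 5500xl. Church's comparison is with the SOLiD 4, and the inputs he used are not stated, so the 5500xl numbers stand in here as an assumption. The function name is hypothetical.

```python
def cost_per_megabase(run_cost_usd, run_yield_bases):
    """Per-run cost divided by per-run yield, in dollars per megabase."""
    return run_cost_usd / (run_yield_bases / 1e6)


# PGM at launch, as quoted: about $500 for up to 10 megabases.
pgm = cost_per_megabase(run_cost_usd=500, run_yield_bases=10e6)

# SOLiD 5500xl, as quoted: $6,000 for 200 Gbp (a stand-in, see above).
solid = cost_per_megabase(run_cost_usd=6000, run_yield_bases=200e9)

ratio = pgm / solid
print(f"PGM:          ${pgm:.2f} per Mb")
print(f"SOLiD 5500xl: ${solid:.2f} per Mb")
print(f"Ratio:        about {ratio:,.0f}x")
```

With these inputs the PGM comes out at about $50 per megabase against about 3 cents, a ratio in the neighborhood of Church's 1,600-fold estimate.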

The Ion Torrent consumable "is a computer chip that's been modified so we can flow biologics into it," Therrien says. Amplified DNA templates on silicon beads are flowed into that flow cell, where they sit in tiny wells. At the bottom of those wells is basically "the smallest pH meter in the world." As nucleotides are flowed into the reaction chamber one by one and added to the growing DNA chain by DNA polymerase, they release protons, causing a pH drop that registers as a change in voltage.

It's a design that requires no optics, no fluorescence, and no imaging; "We call it 'post-light sequencing,'" Therrien says. It is also, for that reason, considerably less expensive than other next-gen platforms, costing just $52,000 for the hardware.
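The detection scheme Therrien describes, with nucleotides flowed in one at a time and a signal registered whenever the polymerase incorporates them, can be illustrated with a toy model. This is a minimal sketch, not Ion Torrent's actual signal processing. It assumes a noise-free signal exactly proportional to the number of bases incorporated in each flow, and the flow order TACG and all function names are hypothetical.

```python
FLOW_ORDER = "TACG"  # hypothetical cyclic order in which nucleotides are flowed


def simulate_flows(new_strand, flow_order=FLOW_ORDER, max_flows=400):
    """Return the idealized signal for each flow.

    new_strand is the sequence the polymerase synthesizes. On each flow,
    every consecutive matching base is incorporated at once, so the
    signal is the length of that run (0 when the flowed base does not match).
    """
    signals = []
    position = 0
    flow = 0
    while position < len(new_strand) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        incorporated = 0
        while position < len(new_strand) and new_strand[position] == base:
            incorporated += 1
            position += 1
        signals.append(incorporated)
        flow += 1
    return signals


def call_bases(signals, flow_order=FLOW_ORDER):
    """Rebuild the sequence from the per-flow signals."""
    return "".join(
        flow_order[i % len(flow_order)] * count
        for i, count in enumerate(signals)
    )


signals = simulate_flows("TTACGGGA")
print(signals)
print(call_bases(signals))
```

In this idealized model a run of identical bases, such as the GGG above, shows up as a single larger signal rather than three separate ones. Telling a run of three from a run of four in a real, noisy signal is a separate problem that the sketch does not address.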

"It's a clever system," says MacArthur. "I think it's really quite elegant, but how well it actually works in the field will be the big test."

(In related news, Roche/454 announced Monday a partnership with DNA Electronics "for the development of a low-cost, high-throughput DNA sequencing system," according to a press release. Details are sketchy, but the described system bears certain similarities to Ion Torrent's technology. According to the release, the system will use "inexpensive, highly scalable electrochemical detection"—as opposed to optical detection—and "leverages 454 Life Sciences' long read sequencing chemistry with DNA Electronics' unique knowledge of semiconductor design and expertise in pH-mediated detection of nucleotide insertions, to produce a long read, high density sequencing platform.")

Of course, not all sequencing will be done on these platforms. Sanger sequencing remains a powerful force in the industry. "If you just want to check one fact in 24 hours, as biologists often do, it's $2 for a 700 bp run," Church says. "And that's a no-brainer."

At the same time, sequencing firms are also pursuing the "next-next" generation of instruments.

One such technology is Project Starlight, which Life Technologies discussed at the Advances in Genome Biology and Technology meeting in February. Project Starlight is a sequencing approach based on single-molecule fluorescence resonance energy transfer (FRET) between a FRET donor-bearing DNA polymerase and FRET acceptor-bearing nucleotide triphosphates.

Unlike most commercial sequencing systems, which amplify a template prior to sequencing it, Starlight sequences individual molecules directly. (Helicos BioSciences' HeliScope commercial sequencer is also a single-molecule technology, as is Pacific Biosciences' in-development SMRT technology.) According to Therrien, the company is "currently targeting a commercial release in mid-2011," with 1-kb read lengths at launch, about twice as long as any current next-gen technology.

Such long reads are sorely needed, MacArthur says. Short reads (such as the products of the SOLiD and Illumina chemistries) make it difficult to assemble a genome without a reference scaffold (that is, de novo), especially if the genome contains repetitive sequences. A few long reads could go a long way towards overcoming that problem, he says.

"Once you start pushing beyond about a kilobase or so, that starts giving you some real power," MacArthur says. "If you can sprinkle even a few of these kind of longer reads into a sequencing project that's already generating lots and lots of short reads, then that potentially can make a big difference to how well you can put the genome together."
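MacArthur's point about repeats can be illustrated with a toy example. The sketch below is not an assembler. It builds a made-up 48-base "genome" containing two copies of a 12-base repeat, treats every window of a given length as an error-free read, and counts how many of those reads match more than one location. The sequences and function names are invented for illustration.

```python
from collections import Counter

# Hypothetical toy genome: three unique flanks around two copies of a repeat.
REPEAT = "ACCACAAACCAC"  # 12 bases, present twice
FLANKS = ("GTTGGTGG", "GGTTGTTT", "TGTGGTTG")
GENOME = FLANKS[0] + REPEAT + FLANKS[1] + REPEAT + FLANKS[2]


def ambiguous_reads(genome, read_length):
    """Count error-free reads of read_length that occur more than once."""
    windows = [
        genome[start:start + read_length]
        for start in range(len(genome) - read_length + 1)
    ]
    occurrences = Counter(windows)
    return sum(1 for window in windows if occurrences[window] > 1)


for read_length in (8, 12, 14):
    count = ambiguous_reads(GENOME, read_length)
    print(f"read length {read_length}: {count} ambiguous reads")
```

Reads no longer than the repeat can fall entirely inside it and so cannot be placed uniquely. Reads longer than the repeat always include some flanking sequence, which in this toy genome is enough to tell the two copies apart.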

Another next-next-gen technology in development is nanopore-based sequencing, in which DNA is "read" as it passes through nanometer-scale holes. Oxford Nanopore has been pursuing that approach for several years now. More recently, Roche/454, in partnership with IBM, entered the fray.

According to a press release [8] announcing the latter collaboration, IBM's in-development approach is based on the company's "DNA Transistor" technology. "The novel technology, developed by IBM Research, offers true single molecule sequencing by decoding molecules of DNA as they are threaded through a nanometer-sized pore in a silicon chip," the release explains.

"It's still an early-stage research project at this point, but Roche is very interested in the future of sequencing," says Montgomery. "I think there's still a lot yet on the horizon to come."

For the next-generation sequencing market overall, that surely is the case.

References
[1] GenomeWeb

[2] The 1000 Genomes Project Consortium, "A map of human genome variation from population-scale sequencing," Nature, 467:1061-73, Oct. 28, 2010.

[3] "Genomes by the thousand," Nature, 467:1026-7, Oct. 28, 2010.

[4] K. Harmon, "Ozzy Osbourne's genome reveals some Neandertal lineage," Scientific American, Oct. 26, 2010.

[5] Daniel MacArthur blog

[6] Daniel MacArthur twitter

[7] Illumina Announces HiSeq 1000 Sequencing System

[8] Roche and IBM Collaborate to Develop Nanopore-Based DNA Sequencing Technology
