Whole Genome Sequencing Technologies Enhance Speed and Throughput

Whole Genome Sequencing’s Speedy Pace
Caitlin Smith has a B.A. in biology from Reed College, a Ph.D. in neuroscience from Yale University, and completed postdoctoral work at the Vollum Institute.

Not long ago, we might have scoffed at television depictions of scientists walking into a lab with a DNA sample and emerging an hour later with a sequenced genome. But scoff no more. Today, whole-genome sequencing (WGS) can sequence an entire genome at once, and it is advancing quickly enough to sequence a whole genome in about 24 hours. Cost remains a barrier, but already WGS is being used to sequence ancient DNA samples of early hominids and plant genomes of agricultural importance. Here is a look at some of today’s technologies and talking points in the fast-moving WGS field.

Throughput and longer read lengths

Advances in speed and throughput of WGS are astonishing. “A [human] haploid genome is roughly 3 billion bases,” says Joel Fellis, Illumina’s market manager for sequencing systems. “However, [in WGS] you don’t sequence that just once. You typically do that 30 to 40 times to get 30X to 40X coverage. If you’re sequencing a tumor genome, you might want total coverage from 80X to 100X to detect really rare events.” Today, Illumina’s WGS technology can sequence an entire human genome in about 24 hours. In contrast, the Human Genome Project was completed in about 13 years—a snail’s pace by today’s standards.
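The arithmetic behind Fellis’s coverage figures can be sketched in a few lines of Python. This is only a rough illustration: it assumes perfectly uniform coverage and ignores duplicate, low-quality, or unmapped reads. The 100-base read length is a hypothetical value chosen for the example, not a figure from the article, and the function names are made up for illustration.

```python
# Rough sequencing-yield arithmetic for a target depth of coverage.
# Genome size is the "roughly 3 billion bases" quoted in the article.
HAPLOID_GENOME_BASES = 3_000_000_000

# Hypothetical read length, assumed only for this example.
ASSUMED_READ_LENGTH = 100


def bases_required(coverage, genome_bases=HAPLOID_GENOME_BASES):
    """Total bases to sequence so each position is read `coverage` times on average."""
    return coverage * genome_bases


def reads_required(coverage, read_length, genome_bases=HAPLOID_GENOME_BASES):
    """Approximate number of reads needed, rounded up to a whole read."""
    total = bases_required(coverage, genome_bases)
    return -(-total // read_length)  # integer ceiling division


if __name__ == "__main__":
    for depth in (30, 40, 80, 100):
        total = bases_required(depth)
        reads = reads_required(depth, ASSUMED_READ_LENGTH)
        print(f"{depth}X: {total:,} bases, about {reads:,} reads of {ASSUMED_READ_LENGTH} bases")
```

Under these assumptions, 30X coverage of a human genome means sequencing about 90 billion bases, and 100X coverage of a tumor genome means about 300 billion.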

Today’s speed is partly due to faster chemistry and faster imaging, which have allowed Illumina to double its daily throughput while shortening sequencing runs from 10 days to one. Other advances in WGS have come from paired-end sequencing and longer read lengths. “With the HiSeq 2500, you can further extend read-lengths to enable 50X to 60X coverage in about 40 hours,” says Fellis. Illumina’s recent acquisition of Moleculo adds a technology designed to produce synthetic long reads of up to 10,000 bases at an extremely low error rate. Illumina’s paired-end technology not only allows better coverage, but also more accurate detection of structural changes. In addition to “short-insert” paired-end sequencing kits for inserts of 300 to 500 bases, Illumina offers “mate-pair” kits for longer DNA inserts of up to 15,000 bases. “You might want to use this if you’re trying to span repetitive regions, looking for structural variants, or doing a de novo assembly,” says Fellis. “This is really critical because it allows you to build a scaffold more easily,” he adds, which helps in applications such as whole tumor genome sequencing.

Other WGS technologies also have improved performance by increasing read-lengths. For example, Pacific Biosciences’ PacBio RS II is a sequencing platform based on the company’s SMRT sequencing technology. The PacBio RS II’s sequencing throughput is double that of its predecessor because the number of simultaneously observable sequencing reactions has doubled, says Pacific Biosciences’ chief science officer Jonas Korlach: “It generates read-lengths averaging 5,000 base pairs, with the longest reads above 20,000 base pairs in length.” Thanks to its SMRT sequencing technology, the PacBio RS II also can reveal epigenetic modifications, such as methylation of DNA bases, in bacteria and, according to Korlach, possibly in larger genomes in the future.

Longer read-lengths also advance Roche Applied Science’s GS FLX+ and GS Junior sequencing systems for smaller genomes. “The long reads generated from the GS FLX+ system in hybrid assemblies (combining data from several sequencing platforms) of large genomes allow researchers to close gaps in reference sequences, improving the quality and completeness of reference genomes,” says Claudia Schmitt, head of communication at Roche Applied Science. Recently, a hybrid assembly using GS FLX+ v2.8 closed some large gaps in the human reference genome. This summer, Roche plans to release an upgrade supporting metagenomic analyses and deep sequencing of pathogens.

Making sense of the data

Not surprisingly, it is challenging to keep the sudden reams of sequencing data organized in a meaningful way. “If you have 100 billion bases of data from a human genome, how do you analyze that?” says Fellis. “It’s quite intimidating for the average biologist.” Illumina’s new cloud-computing environment, BaseSpace, attempts to manage the data avalanche in an intelligent manner. “The vision is that you just need an Internet connection, and you have all the tools that you’d want for analyzing a human genome at your finger touch,” says Fellis. “We’re approaching an age where biologists can do this; it doesn’t have to be the domain of bioinformaticists anymore.” BaseSpace offers instant backup of data, fast data processing, and access to the tools in Illumina’s sequencing and alignment workflows as well as many third-party sequencing applications.

WGS approaches the clinic

WGS is making “personalized medicine” more of a reality with each additional year that sequencing costs decline. “Right now, in the clinic, WGS is still pretty uncommon,” says Jonas Lee, senior vice president of marketing and corporate development at Knome. “Once prices fall to below $1,000, however, WGS will get really interesting as genetic-testing labs find it cheaper to sequence the whole genome, even if all of the data are not immediately needed. ‘Banking’ the genome will then allow labs to run future tests in silico, shortening testing times and taking costs out of the system.”

With many clinicians unaccustomed to reading or interpreting genetic sequencing data, companies like Knome are helping to bridge the gap. “We offer interpretation tools that help doctors make sense of next-generation sequencing data—for targeted panels, exomes, all the way up to whole genomes,” says Lee. “Our tools are packaged together into the knoSYS100 appliance, an end-to-end human genome interpretation system that integrates an advanced genome interpretation application (knoSOFT) and a powerful informatics engine (kGAP) with a high-performance grid-computing system.”

Researchers tend to view genomic sequencing as falling squarely within the domain of the research lab, but as WGS prices fall, and bridges between labs and clinics multiply, you may encounter sequencing in your own doctor’s office in the very near future.
