Digital Gene Expression

Editorial Article

Monday April 06, 2009

Caitlin Smith

Hot on the heels of microarrays is digital gene expression for the analysis of the transcriptome—all the mRNA being expressed at a given time. But so much important work has already been done on microarrays, you might say, surely it wouldn’t pay to change platforms now.

In fact, it might be worth considering. Using next-generation sequencing technologies to generate large amounts of data fast, digital gene expression has several advantages over its microarray predecessor: an unbiased view of the transcriptome; detection of different levels of polyadenylation and antisense transcription; better measurements of very low-abundance transcripts; the ability to measure transcripts for which there were no probes on arrays; and better inter-lab reproducibility (microarray results can vary despite attempts to control conditions). “Our understanding of the transcriptome is constantly evolving, making it difficult for microarrays to stay current,” says Roland Wicki, Applied Biosystems' director of SOLiD strategy at Life Technologies. Is there a future for microarrays in the face of such an attractive competitor?

Transcriptome profiling by sequencing

Profiling the transcriptome by sequencing is a powerful discovery technique. “The value of sequencing lies in the fact that no sequence knowledge is required and the dynamic range is almost unlimited. Small RNA microarrays, in particular, suffer from hybridization bias due to the short length and lack of sequence complexity,” notes Wicki. Applied Biosystems’ (now Life Technologies) SOLiD™ 3 System is designed for transcriptome profiling by sequencing, and uses massively parallel sequencing and barcoding to achieve high-throughput results. “Sequencing is the best way to truly profile all aspects of sequence variation, such as isomirs and RNA editing. In addition, you can look at all RNA that is transcribed, not just polyA+ and miRNAs,” says Wicki. “[Besides] the profiling aspects, one might want to discover point mutations in expressed transcripts, or detect allele-specific expression of coding SNPs. Also, fusion transcripts, known or unknown, may be detected by sequencing the entire transcriptome.”

Like the SOLiD system, Illumina’s Genome Analyzer along with digital gene expression kits is designed for discovery and other applications. The mRNA-Seq kit finds and counts full-length polyA transcript isoforms, the tag profiling kit sequences short 3’ transcript fragments, and the small RNA discovery kit finds small RNAs without sequence information. “All these are discovery applications for finding new things, but they are also quantitative so you can do very nice profiling experiments,” says Shawn Baker, Market Manager for Expression and Regulation at Illumina. “One of the nice things about all of these kits is that their sensitivity is limited by how deeply you sequence—and that’s in contrast to arrays. So the more deeply you sequence, the finer a difference you’re going to be able to detect.”
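Baker's point about depth can be illustrated with a simple sampling model. The sketch below is not part of any Illumina kit or software; it assumes reads are drawn independently, so the read count for a transcript is approximately Poisson, and the function name and the example abundance are hypothetical.

```python
import math


def detection_probability(depth: int, fraction: float, min_reads: int = 1) -> float:
    """Chance of observing at least `min_reads` reads from a transcript that
    makes up `fraction` of the library, given `depth` total reads.

    Assumes a Poisson sampling model (an illustrative simplification).
    """
    expected = depth * fraction  # expected number of reads for this transcript
    # P(X >= k) = 1 - sum over i < k of e^(-expected) * expected^i / i!
    below = sum(
        math.exp(-expected) * expected**i / math.factorial(i)
        for i in range(min_reads)
    )
    return 1.0 - below


if __name__ == "__main__":
    rare = 1e-7  # hypothetical abundance: one transcript in ten million
    for depth in (1_000_000, 10_000_000, 100_000_000):
        p = detection_probability(depth, rare, min_reads=5)
        print(f"{depth:>11,} reads: P(at least 5 reads) = {p:.3f}")
```

Under this model, the chance of collecting enough reads from a rare transcript rises steeply as the total number of reads grows, whereas an array's sensitivity is fixed by its probes and hybridization chemistry.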

Baker says that the major challenges for digital gene expression are cost and throughput: “Compared to arrays, sequencing is still more expensive and the throughput is lower.” But scientists at Illumina are working on both. Through improvements in technology, chemistry, and software, Illumina expects to achieve 95 GB of data on a single flow cell by the end of the year. “And more importantly for transcription applications, that translates into about 300-350 million useable reads [equivalent to 600-700 million paired-end tags], and that’s up from about 80-100 million reads today,” notes Baker. “That will start increasing the throughput and driving down the cost.”

GenXPro recently improved their tag-based SuperTag-DGE and SuperSAGE methods using longer tags (26 base pairs) for greater precision and accuracy in digital gene expression profiling. “The longer tags can be annotated to their corresponding transcripts with much higher precision and allow to distinguish between many transcripts, which cannot be discriminated with only 21 base pairs,” says Björn Rotter, head of functional genomics at GenXPro. “The other major advantage of the longer tag is that unknown transcripts that are identified in digital gene expression can be studied further, because the tag information can be comfortably used to design specific PCR primers for RACE or other PCR-based approaches.” This makes SuperTag-DGE especially useful for non-model organisms, for example.
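The effect of tag length on annotation can be shown with a toy example. The sketch below is not GenXPro's method: the sequences are made up, the function name is hypothetical, and a tag is simply taken as the first N bases of each toy sequence rather than being cut next to a restriction site as in a real protocol.

```python
from collections import defaultdict
from typing import Dict, List


def ambiguous_tags(transcripts: Dict[str, str], tag_length: int) -> Dict[str, List[str]]:
    """Return every tag shared by more than one transcript, mapped to the
    names of the transcripts that share it."""
    by_tag = defaultdict(list)
    for name, sequence in transcripts.items():
        by_tag[sequence[:tag_length]].append(name)
    return {tag: names for tag, names in by_tag.items() if len(names) > 1}


# Made-up sequences: gene_A and gene_B agree for 21 bases and differ at base 24.
SHARED_PREFIX = "CATGACCTGAAGTCCTTGAGA"
TOY_TRANSCRIPTS = {
    "gene_A": SHARED_PREFIX + "CGTAC",
    "gene_B": SHARED_PREFIX + "CGAAC",
    "gene_C": "CATGTTGCAACGGATCCATTGGCAAT",
}

if __name__ == "__main__":
    for length in (21, 26):
        clashes = ambiguous_tags(TOY_TRANSCRIPTS, length)
        print(f"{length}-base tags: {len(clashes)} ambiguous tag(s)")
```

With 21-base tags, gene_A and gene_B collapse into one indistinguishable tag; the five extra bases of a 26-base tag are enough to separate them in this toy case.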

Another advantage of longer tags is the ability to differentiate clearly the transcripts of hosts versus their parasites or pathogens. “For example, we frequently identify viral transcripts or other unexpected transcripts, which can often not be clearly differentiated from the host with shorter tags,” says Rotter. GenXPro has also developed a method to eliminate bias introduced by PCR. This has tremendous value for methods involving PCR because “certain tags or cDNAs will be amplified better and more than others,” says Rotter. “This will introduce a quantification error into the data.”

Single-molecule sequencing in action

Single-molecule sequencing is a method that sequences without amplification or ligation, thereby eliminating a source of bias. Helicos BioSciences’ True Single Molecule Sequencing (tSMS)™ technology is at the heart of its digital gene expression applications, such as Helicos™ Digital Gene Expression, Helicos™ RNA-Seq and Helicos™ Small RNA Seq. “Helicos Digital Gene Expression produces one cDNA read per mRNA molecule, providing a very quantitative measurement of expression, without the need for deep sequencing,” says Avak Kahvejian, senior product manager at Helicos BioSciences. “Helicos RNA-Seq produces reads from throughout an mRNA molecule, providing even sequence coverage of the transcriptome, and a comprehensive view of alternative splicing and allele-specific expression.”
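When each read stands for one molecule, expression can be estimated by counting. The sketch below is not Helicos software; it assumes reads have already been assigned to genes, uses a hypothetical function name, and applies a generic tags-per-million scaling chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def tags_per_million(assignments: Iterable[str]) -> Dict[str, float]:
    """Count reads per gene and scale the counts to a library of one million
    reads, so that samples sequenced to different depths can be compared."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical gene assignments for eight reads.
    reads = ["actb", "actb", "actb", "actb", "gapdh", "gapdh", "tp53", "myc"]
    for gene, value in sorted(tags_per_million(reads).items()):
        print(f"{gene}: {value:,.0f} tags per million")
```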

NanoString Technologies also offers single-molecule detection in their nCounter™ Analysis System, which uses single-molecule imaging for direct counting, enabled by fluorescent molecular barcodes (each a unique string of colored fluorophores). These barcodes, which are directly hybridized to mRNA molecules, allow multiplexing of up to 550 different transcripts in one reaction.

Software for analysis

Central to digital gene expression is software to handle the analysis. Bioinformatics company DNASTAR recently released ArrayStar v3.0, whose optional module, QSeq, was designed for digital gene expression and RNA-seq applications. “What we are doing differently than most is combining the visualizations and analytical tools in our microarray expression analysis software with a tool (QSeq) that allows users to work with sequencing data,” says Robert Steinhauser, marketing director at DNASTAR.

Another bioinformatics provider, CLC bio, recently released version 3.0 of their Genomics Workbench. “The software contains an RNA-seq module that makes it very easy for the user to map a full dataset of cDNA reads to either an annotated genome or an EST library,” explains Roald Forsberg, director of scientific software solutions at CLC bio. “The output of the mapping analysis includes detailed information concerning gene expression levels and novel putative exons, and is presented to the user as both a visual mapping result and as tabular information. These results can then be further analyzed using the elaborate gene expression analysis and visualization tools also included in the program.” Forsberg adds that their new release stands out as the first graphical user interface for a full RNA-seq workflow. In coming weeks, CLC bio will also offer the tools needed for tag-based digital gene expression analysis.

In the excitement of digital gene expression, what will become of microarrays? Forsberg believes that traditional arrays are not going away soon. “We believe that it is important for users to be able to access the wealth of array data that has been generated over the last years with the large investment in knowledge, time and resources that they represent,” he says. “For this reason, we have developed an expression analysis package that allows the user to combine gene expression results from both analog and digital gene expression analysis.” Maybe it is possible to have the best of both techniques after all.
