Human and other mammalian genomes pervasively transcribe tens of thousands of long non-coding RNAs (lncRNAs). The latest release from the public research consortium GENCODE (version 27) catalogs just under 16,000 lncRNA genes in the human genome, producing nearly 28,000 transcripts; when other databases are included, more than 40,000 lncRNAs are known.

These mRNA-like transcripts have been found to play a controlling role at nearly all levels of gene regulation, and in biological processes like embryonic development. A growing body of evidence also suggests that aberrantly expressed lncRNAs play important roles in multiple disease states, including cancer. But although these RNAs—distinguished from smaller regulatory RNAs by their length of more than 200 nucleotides—have been known for more than a decade, only a few have been functionally characterized.

“You have to start by detecting the non-coding transcript,” says David Corey, Ph.D., professor of pharmacology at the University of Texas Southwestern Medical Center in Dallas, whose research focuses on lncRNAs as drug targets. “I’m not a big fan of using databases to see if someone else has detected it, because of the possibility for errors. They may have detected an RNA that isn’t there in your cell line, or only there in a small amount, or even worse, miss something.”

When not looking genome-wide, Dr. Corey uses quantitative PCR (qPCR) to detect the non-coding transcript. “If we think there’s a non-coding transcript there, we’ll use 5’ and 3’ RACE [Rapid Amplification of cDNA Ends] to amplify and characterize the regions of unknown sequence, and get a feel for how far that transcript goes,” he says. “Now it’s not essential to do that, but knowing where the sequence starts and where it ends is a good piece of information to have. That said, if you’re primarily interested in function and manipulating function, you may not have to know the entire thing, and it can be difficult to detect if there are overlapping transcripts.”

If there’s a transcript there, the next step is using reverse transcription to make the RNA. “Or you can buy a suitably long piece of RNA, but it should be the same sequence you’re interested in,” says Dr. Corey. “You want to use the exact same qPCR primers—because it makes a big difference whether something is present at one copy per thousand cells, one copy per cell, or one thousand copies per cell.”
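The copy-number comparison Dr. Corey describes is commonly worked out with a standard curve: a dilution series of the known RNA is run with the same primers, and the unknown sample's threshold cycle (Ct) is read off the fitted line. The sketch below illustrates that arithmetic only. It is not Dr. Corey's protocol; the function names, the Ct values, and the cell count are all hypothetical.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate input copies for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)

def copies_per_cell(ct, slope, intercept, cells_in_reaction):
    """Estimated transcript copies divided by the cells the RNA came from."""
    return copies_from_ct(ct, slope, intercept) / cells_in_reaction

# Hypothetical dilution series of the reference RNA (copies per reaction)
# and made-up Ct values measured with the same primers as the sample.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_ct = [33.36, 30.04, 26.72, 23.40, 20.08]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10 ** (-1 / slope) - 1

# Hypothetical sample: Ct of 30.04 from RNA equivalent to 1,000 cells.
sample_copies_per_cell = copies_per_cell(30.04, slope, intercept, 1000)
print(f"slope={slope:.2f}, efficiency={efficiency:.2f}")
print(f"estimated copies per cell: {sample_copies_per_cell:.2f}")
```

Because every three or so cycles on such a curve corresponds to roughly a tenfold change in input, a modest shift in Ct separates one copy per thousand cells from one copy per cell, which is why using identical primers for the standard and the sample matters.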

“Profiling these lncRNAs, once identified, poses its own unique challenges compared with mRNAs,” says Yanggu Shi, Ph.D., a senior scientist with ArrayStar, which specializes in tools and technology for the analysis of expression profiling and regulation of RNAs, particularly regulatory non-coding RNAs. “For example, lncRNA expression abundance levels are generally lower, about tenfold lower on average, and their high tissue specificity contributes to a diluted presence in total RNA (nearly 80% of lncRNAs are tissue specific, compared with less than 20% of mRNAs). Long non-coding RNAs are also less annotated because they do not encode proteins. While some lncRNAs have a poly(A) tail, others do not. And they are mostly located in the nucleus. For all these reasons, we need special ways to analyze and annotate them.”

For example, at the 40 million reads of a typical RNA sequencing run, the read coverage for lncRNAs is too low for reliable quantification. “Using this approach, lncRNA sequencing requires at least 100 million reads, much more than the usual mRNA sequencing,” Dr. Shi explains. “Microarrays, by comparison, are relatively unaffected by the transcript levels at low abundance, and have low rates of quantification error.”
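A rough back-of-the-envelope calculation shows why depth matters for low-abundance transcripts. The sketch below assumes a deliberately simple model that is not taken from Dr. Shi or ArrayStar: reads per transcript scale linearly with sequencing depth, and counting noise is Poisson, so the relative error is about one over the square root of the read count. The abundance figures are hypothetical; only the 40 million and 100 million read depths and the tenfold abundance gap come from the text.

```python
import math

def expected_reads(total_reads, reads_per_million):
    """Expected read count for a transcript at a given sequencing depth."""
    return total_reads * reads_per_million / 1_000_000

def poisson_cv(read_count):
    """Relative counting error (coefficient of variation) under a Poisson model."""
    return 1 / math.sqrt(read_count)

# Hypothetical abundances: an mRNA at 10 reads per million sequenced reads,
# and a lncRNA tenfold lower at 1 read per million.
MRNA_RPM = 10
LNCRNA_RPM = 1

for depth in (40_000_000, 100_000_000):
    for name, rpm in (("mRNA", MRNA_RPM), ("lncRNA", LNCRNA_RPM)):
        reads = expected_reads(depth, rpm)
        print(f"{depth:>11,} reads  {name:<6} "
              f"expected={reads:7.0f}  relative error={poisson_cv(reads):.1%}")
```

Under these assumptions the lncRNA collects a tenth of the reads the mRNA does at any given depth, and raising the depth from 40 million to 100 million reads gives it 2.5 times as many reads to count. Real experiments also depend on transcript length, library preparation, and mapping rates, none of which this sketch models.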

Dr. Shi says that ArrayStar’s microarrays for lncRNA are well annotated, functionally studied, and experimentally supported, with multiple publications in the literature. As an example, he points to the successful use of ArrayStar microarrays in work published in Nature Cell Biology, demonstrating that the cytoplasmic LINK-A lncRNA activates normoxic HIF1α signalling in triple-negative breast cancer.

A giant step toward systematically analyzing the lncRNA sequences that impart nuclear localization has recently been reported by the laboratory of John Rinn, Ph.D., formerly of Harvard and MIT’s Broad Institute and now the Leslie Orgel Professor of RNA Science at the University of Colorado’s BioFrontiers Institute. First published on the bioRxiv online archive run by Cold Spring Harbor Laboratory in September 2017, Dr. Rinn’s paper describes the development of a massively parallel reporter assay (MPRA) uniquely designed to identify sequences sufficient for RNA nuclear enrichment for 38 human lncRNAs.

“Massively parallel reporter assays have been great for DNA,” says Dr. Rinn. “They chug through everything that looks active, and decide which is really active. We needed something like that for RNA nuclear enrichment. This assay allows us to put 100,000 oligos or more in a pool, and see which of these are sufficient to bring that cytoplasmic reporter back into the nucleus. Anything where you can select activity vs. non-activity, plus or minus, we can separate the winners from the losers.”

Using this approach, Dr. Rinn and his team identified 109 unique, conserved nuclear enrichment regions, originating from 29 distinct lncRNAs. They also discovered two shorter motifs within their nuclear enrichment regions, and further validated the sufficiency of several regions to impart nuclear localization using single molecule RNA fluorescence in situ hybridization (smRNA-FISH).
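One simple way to picture how such a pooled assay separates "winners from losers" is to compare each oligo's read counts in the nuclear and cytoplasmic fractions. The sketch below is an illustration under stated assumptions, not the analysis used in Dr. Rinn's paper: the scoring formula, the pseudocount, the threshold, and the counts are all hypothetical.

```python
import math

def nuclear_enrichment(nuc_counts, cyt_counts, pseudocount=1.0):
    """Per-oligo log2 ratio of depth-normalized nuclear to cytoplasmic reads.

    A pseudocount is added to each count so oligos with zero reads in one
    fraction do not cause a division by zero or log of zero.
    """
    nuc_total = sum(nuc_counts.values())
    cyt_total = sum(cyt_counts.values())
    scores = {}
    for oligo, nuc in nuc_counts.items():
        cyt = cyt_counts.get(oligo, 0)
        nuc_cpm = (nuc + pseudocount) / nuc_total * 1e6
        cyt_cpm = (cyt + pseudocount) / cyt_total * 1e6
        scores[oligo] = math.log2(nuc_cpm / cyt_cpm)
    return scores

# Hypothetical read counts for three oligos in each cell fraction.
nuclear = {"oligo_A": 399, "oligo_B": 99, "oligo_C": 502}
cytoplasmic = {"oligo_A": 99, "oligo_B": 399, "oligo_C": 502}

scores = nuclear_enrichment(nuclear, cytoplasmic)

# Hypothetical cutoff: call an oligo nuclear-enriched at a twofold ratio or more.
THRESHOLD_LOG2 = 1.0
enriched = [oligo for oligo, s in scores.items() if s >= THRESHOLD_LOG2]

for oligo, s in scores.items():
    print(f"{oligo}: log2(nuclear/cytoplasmic) = {s:+.2f}")
print("nuclear-enriched:", enriched)
```

A real analysis would add replicates and a statistical test before calling a region enriched; the point here is only that a plus-or-minus readout per oligo lets a pool of 100,000 or more sequences be scored in one experiment.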

“With RNA, it all comes down to structure,” Dr. Rinn says. “Consider the most well-studied RNA, TERC, the telomerase RNA. It’s absolutely required for viability. The yeast and human telomerase are different sizes and sequences, yet you can swap them and they still work the same. Evolution has not preserved the sequence of this gene, but it has preserved the structure and the binding partners. With this assay, we can add mutations that would change the sequence, but preserve the structure, and ask if that’s still sufficient to bind it. Compared with classic biochemical assays, it’s like going from ColecoVision or Pong to an Xbox, with so much more resolution.”

In his lab, Dr. Corey continues to look for lncRNAs that could serve as therapeutic targets. “One we are particularly interested in right now is a repeat expansion within an intron related to gene expression in Friedreich’s ataxia,” he says. “If we can upregulate the target, we may have a treatment for the disease.”

Image courtesy of Dreamstimes Images.