Library Design and HT Sequencing are Crucial to Obtain
Quality Results from Pooled shRNA Library Screening
Previously, we attempted a genome-wide RNAi knockdown screen to identify signaling molecules in a FAS-induced
HeLa cell model with a commercially available lentiviral-based pooled shRNA library constructed with 200,000 shRNAs.
The screen produced a high percentage of both false positives and false negatives, with only about 4%
of putative positives correlating to modulators in the known apoptotic signal transduction pathways. These findings
indicated a potentially useful but problematic screening. In an effort to optimize the screening process and assess the
effectiveness of pooled shRNA libraries, we designed an improved Cellecta shRNA Lentiviral Library to perform a similar
Description of Screening Problem—Few True Positives, Lots of False Positives
Gene knockdown screening with a commercially available pooled lentiviral-based
shRNA library constructed with 200,000 shRNAs was undertaken to
identify genes modulating FAS-induced apoptosis of HeLa D98/AH2 cells. The
library contained 4-5 shRNA inserts targeting each of the 47,000 transcripts
represented on the Affymetrix Human Genome U133+2 GeneChip array. Each
of the shRNA sequences was selected to be compatible with one of the probe
sequences on the Genome U133+2 array with the purpose of using
hybridization to this array as a means to detect shRNA sequences expressed
by cell populations.
In duplicate, HeLa cells (1×10^8 cells) were transduced with the packaged 200K
shRNA library (MOI = 0.3), then treated with anti-Fas receptor antibodies that
activate the FAS-pathway and induce apoptosis, and grown for seven days. In
control cells the FAS pathway was not induced. The entire pool of siRNA inserts
was amplified by two rounds of PCR using loop-specific and vector-specific
primers from 5 µg of cDNA from control (untreated) and Fas-treated cells.
Amplified DNA was hybridized to the Affymetrix HG U133+2 arrays (Figure on
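As a rough check on how well the transduction step covers the library, the sketch below estimates the fraction of cells carrying at least one integrant and the average number of transduced cells per shRNA. It assumes that integrations per cell follow a Poisson distribution with mean equal to the MOI and that all shRNAs are equally represented. Neither assumption is stated in the experiment above, and the function name is illustrative only.

```python
import math

def transduction_stats(n_cells, moi, library_size):
    """Poisson-based estimate (an assumption, not the authors' calculation)."""
    frac_any = 1.0 - math.exp(-moi)      # P(at least one integration per cell)
    frac_single = moi * math.exp(-moi)   # P(exactly one integration per cell)
    transduced_cells = n_cells * frac_any
    cells_per_shrna = transduced_cells / library_size  # assumes equal representation
    return frac_any, frac_single, cells_per_shrna

# Values taken from the text: 1x10^8 cells, MOI = 0.3, 200,000 shRNAs.
frac_any, frac_single, coverage = transduction_stats(1e8, 0.3, 200_000)
print(f"transduced fraction: {frac_any:.3f}")
print(f"single-integrant fraction: {frac_single:.3f}")
print(f"average transduced cells per shRNA: {coverage:.0f}")
```

Under these assumptions roughly a quarter of the cells are transduced, which corresponds to on the order of 130 cells per shRNA for a 200,000-shRNA library.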
The selection protocol was efficient and resulted in about 95% cell death.
Despite this stringency, we identified approximately 2,500 shRNA
sequences that were significantly enriched in the treated vs. control
populations, and less than 50% of these overlapped between the two
independent experiments, indicating an abundance of false positives. Among
the positives, there were 20 apoptosis-related gene candidates and three well-known
apoptosis-related genes (bax, bcl-xL, and FasL), which indicates the
potential of the approach. However, the excessively high number of false
positives (background) makes it impossible to select real positives based on
the experimental results alone. In addition, several well-known genes involved in
activation of Fas-induced apoptosis, such as CASP8, BID, and DR4, were not
recovered, which indicates a high percentage of false negatives.
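The replicate comparison described above can be sketched as a simple set overlap. The shRNA identifiers below are hypothetical placeholders, and the function is an illustration of the idea rather than the analysis actually used in the screen.

```python
def replicate_overlap(hits_a, hits_b):
    """Summarize how many enriched shRNAs are shared between two replicates."""
    a, b = set(hits_a), set(hits_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        "fraction_of_a": len(shared) / len(a) if a else 0.0,
        "fraction_of_b": len(shared) / len(b) if b else 0.0,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }

# Hypothetical hit lists from two independent experiments.
replicate_1 = ["sh001", "sh002", "sh003", "sh004"]
replicate_2 = ["sh003", "sh004", "sh005"]

stats = replicate_overlap(replicate_1, replicate_2)
print(stats)
```

A low shared fraction between independent replicates, as reported here, suggests that many of the individual hits are not reproducible.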
Analysis of 200K Screening Results—Library Problem
The results of the screening and hybridization with the shRNA library
constructed with 200,000 shRNAs generated too many false positives to
confidently identify shRNA sequences enriched in the target population. To
address this, we decided to repeat the RNAi screen with a lower-complexity
shRNA library and to use a more quantitative method, high-throughput (HT) sequencing, to assess the abundance of each shRNA in the sample population. We found that
we could use HT sequencing to address both of these points.
First, we looked at the library. When the library itself was hybridized to the Affymetrix HG U133+2 arrays, only
approximately half the shRNA sequences were detectable. An undetected sequence may either hybridize weakly
or be missing from the library. To distinguish these cases, we attempted to directly amplify by PCR 40 shRNA
sequences that were not detected in the packaged library with the array. Of these, 36 could not be amplified, indicating that
they are missing from the library. Thus, although the library was constructed with 200,000 shRNAs, it seems that
approximately 40% of the shRNAs are missing. This suggests that the abundance of the remaining sequences probably
also varies widely, with some being very abundant and others present in low numbers.
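The estimate of missing shRNAs can be reproduced approximately from the numbers above. The sketch below treats "approximately half" as exactly 0.5 (an assumption) and adds a Wilson score interval for the 36-of-40 PCR result to show the sampling uncertainty. The interval is an addition for illustration and is not part of the original analysis.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half

undetected_fraction = 0.5  # "approximately half", taken as exactly 0.5 (assumption)
pcr_tested = 40            # undetected shRNAs tested by direct PCR
pcr_absent = 36            # of those, not amplifiable

# Fraction of the whole library that is missing = undetected x absent-by-PCR.
point_estimate = undetected_fraction * pcr_absent / pcr_tested
low, high = wilson_interval(pcr_absent, pcr_tested)
missing_low = undetected_fraction * low
missing_high = undetected_fraction * high

print(f"missing fraction: {point_estimate:.2f} "
      f"(range {missing_low:.2f} to {missing_high:.2f})")
```

With these inputs the point estimate is 45%, with a range of roughly 38% to 48%, which is in line with the approximately 40% quoted above.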
The array hybridization data also revealed two other problems. (1) There was some cross-hybridization
of signals, which contributed to the false-positive background, and (2) the dynamic range from
the lowest to the highest enrichment levels was limited to approximately 100-fold. Below this range, there may be some hybridization
but the shRNA sequence is undetectable and, at the high end, the signals plateau, so any enrichment in shRNA
between control and treated populations is undetectable.
New Library to Improve Screening Effectiveness
To address the shRNA representation, signal cross-hybridization, and the low
dynamic range of array readouts, we constructed a new library where each
shRNA insert also contained a uniquely identifiable sequence (a “barcode”) that
was compatible with Illumina GAIIx sequencing technology so it could be
detected by HT sequencing. We also made the library with a pool of just 38,000
shRNAs designed to target 8,000 well-annotated genes. A library of this smaller
size should be more practical to screen exhaustively and help ensure that all
shRNAs are interrogated in the assay.
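Barcode-based detection reduces to counting how often each known barcode appears in the sequencing reads. The sketch below assumes that the barcode sits at a fixed position in each read and is matched exactly; the position, the barcode length, and the example sequences are hypothetical, since the actual barcode design is not given here.

```python
from collections import Counter

def count_barcodes(reads, known_barcodes, start=0, length=4):
    """Count exact barcode matches at a fixed read position (assumed layout)."""
    known = set(known_barcodes)
    counts = Counter({barcode: 0 for barcode in known})  # keep zero-count barcodes
    unmatched = 0
    for read in reads:
        candidate = read[start:start + length].upper()
        if candidate in known:
            counts[candidate] += 1
        else:
            unmatched += 1
    return counts, unmatched

# Hypothetical 4-nt barcodes and reads, for illustration only.
barcodes = ["ACGT", "TTGA", "GGCC"]
reads = ["ACGTAAAA", "acgtCCCC", "TTGAGGGG", "NNNNAAAA"]

counts, unmatched = count_barcodes(reads, barcodes)
print(dict(counts), "unmatched:", unmatched)
```

Keeping barcodes with zero counts in the result makes missing library members visible, which matters for the representation check that follows.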
Analysis of the Cellecta 38K shRNA Library by HT sequencing demonstrated
that, after ligation and packaging, all sequences were present. Further, at
least 95% of the shRNAs fell within a range where there was less than a 10-fold
difference from the least to the most abundant (see figure), although at least 5
orders of magnitude in range were easily discerned.
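One way to express this representation check is to trim the extreme tails of the barcode count distribution and report the fold difference across what remains. The sketch below uses that interpretation, which is an assumption about how the 95% figure is defined, and the counts are made up for illustration.

```python
def central_fold_range(counts, keep=0.95):
    """Fold difference between most and least abundant barcodes
    after trimming (1 - keep) / 2 of the barcodes from each tail."""
    if any(c <= 0 for c in counts):
        raise ValueError("all barcodes must have a positive count")
    ordered = sorted(counts)
    n = len(ordered)
    trim = int(n * (1.0 - keep) / 2.0)  # number of barcodes dropped per tail
    kept = ordered[trim:n - trim]
    return kept[-1] / kept[0]

# Hypothetical read counts: 38 evenly represented barcodes plus two outliers.
example_counts = [1] + list(range(100, 138)) + [50000]

full_range = max(example_counts) / min(example_counts)
central_range = central_fold_range(example_counts, keep=0.95)
print(f"full fold range: {full_range:.0f}")
print(f"central 95% fold range: {central_range:.2f}")
```

In this toy example the full range spans more than four orders of magnitude because of two outliers, while the central 95% of barcodes differ by well under 10-fold.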
Results and Conclusion with Improved Quality Library
Screening with the Cellecta 38K shRNA Library found approximately 350
shRNAs enriched in the treated population. Of these, 150 (or 43%) were known
modulators of apoptosis and fell into the expected NF-κB
and p53 pathways.
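Enrichment from sequencing counts is commonly expressed as a log2 fold change between depth-normalized treated and control counts. The statistical method used in this screen is not described here, so the sketch below is a generic illustration: the counts-per-million normalization, the pseudocount, and the barcode names are all assumptions.

```python
import math

def log2_enrichment(treated, control, pseudocount=1.0):
    """log2(treated / control) on counts-per-million, with a pseudocount."""
    treated_total = sum(treated.values())
    control_total = sum(control.values())
    scores = {}
    for barcode, control_count in control.items():
        treated_cpm = treated.get(barcode, 0) / treated_total * 1e6
        control_cpm = control_count / control_total * 1e6
        # The pseudocount avoids division by zero for barcodes absent in one sample.
        scores[barcode] = math.log2(
            (treated_cpm + pseudocount) / (control_cpm + pseudocount)
        )
    return scores

# Hypothetical barcode counts, for illustration only.
control_counts = {"bc1": 500, "bc2": 500}
treated_counts = {"bc1": 900, "bc2": 100}

scores = log2_enrichment(treated_counts, control_counts)
for barcode, score in sorted(scores.items()):
    print(f"{barcode}: log2 fold change = {score:.2f}")
```

In practice a threshold on this score would be applied in each replicate, and only shRNAs passing in both would be called enriched.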
The results demonstrate that careful RNAi knockdown screening of a targeted
set of genes with a well-designed pooled library can provide a comprehensive
interrogation of the genes responsible for a particular biological response, in this
case, FAS-induced apoptosis. The marked difference between the results of the
Cellecta 38K shRNA screen and those of the 200K shRNA screen emphasizes the
importance of library representation and complexity in ensuring that a
screen produces usable results.