Determining gene function and the relationship between genotype and phenotype is the ultimate goal of functional genomics and is crucial for furthering our understanding of cellular biology and processes. Pooled CRISPR-Cas9 screens are now a popular approach to functional genomics at scale—delivery of genome-wide sgRNA libraries by lentiviral vector allows thousands of genes to be targeted simultaneously. As long as the gene contains a PAM site (protospacer adjacent motif), the Cas9 endonuclease can be easily targeted to a specific sequence by the sgRNA.

The most commonly used Cas9, from S. pyogenes, targets the 5’-NGG PAM site, which occurs on average once every 8 bp within the human genome, so there is an abundance of potential sgRNA to choose from.1 Careful selection of which ones to include in the library is nevertheless key to the success of a screen: not all sgRNA are created equal, and individual sgRNA vary in efficiency and specificity. The sequence of the sgRNA itself (sequence-specific activity), homology to sequences elsewhere in the genome (off-target activity), and the DNA binding context of the sgRNA (context-specific activity) all contribute to variation in sgRNA efficacy. Library design therefore requires good genomic characterization data to ensure the biological relevance of the target sequence, together with algorithms and design rules that admit only the most efficient guides into the library.
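
As a rough illustration of why candidate guides are so plentiful, the sketch below scans both strands of a DNA sequence for 20-nt protospacers immediately followed by an NGG PAM. The function names and the 20-nt guide length are assumptions made for this example, and the code does no scoring for efficiency or off-target activity, which is the step a real library design depends on.

```python
import re

# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidate_guides(seq: str, guide_len: int = 20):
    """List (strand, start, protospacer, pam) for every NGG PAM site.

    `start` is the 0-based position of the protospacer on the strand
    that was scanned ("+" is the input, "-" its reverse complement).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    demo = "ACGTACGTACGTACGTACGT" + "AGG" + "TTT"  # hypothetical sequence
    for hit in find_candidate_guides(demo):
        print(hit)
```

Every tuple returned is only a candidate; filtering by predicted activity and specificity would follow.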

The ability of the CRISPR-Cas9 system to target virtually any genomic locus makes it inherently flexible, but it also means that designing an optimal sgRNA library for screening can be challenging, especially if one wishes to incorporate all the latest best practices in guide design. Fortunately, there are several commercially available libraries, with sgRNAs selected for optimal efficiency and biological significance, that can be used to avoid the time-consuming process of library design and validation. There remain, however, some considerations when selecting such a pre-designed library.

The CRISPR-Cas9 toolkit

The first choice to make when embarking on a CRISPR-Cas9 screen is the type of genetic perturbation that will be employed, which is largely determined by the biological question you wish to address. The most common choice is a CRISPR-KO library, where the Cas9 protein introduces a double-strand break in the DNA and ablates gene function through frameshift mutations introduced by error-prone NHEJ-based DNA repair. The sgRNA in these libraries target constitutively expressed exons, typically toward the 5’ end of the gene, as edits there are more likely to disable the gene. However, guides targeting the functional domains of proteins can also be effective in knocking out gene function, even if frameshift mutations are not introduced.2
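
The frameshift logic behind a knockout can be written down in a couple of lines. This is a minimal sketch with a hypothetical helper name: an insertion or deletion shifts the reading frame only when its length is not a multiple of three.

```python
def causes_frameshift(indel_length: int) -> bool:
    """Return True if an indel of this size shifts the reading frame.

    Positive values are insertions, negative values are deletions.
    """
    if indel_length == 0:
        raise ValueError("indel_length must be non-zero")
    return abs(indel_length) % 3 != 0


if __name__ == "__main__":
    for size in (-1, -3, 2, 6):
        print(size, causes_frameshift(size))
```

In-frame indels (multiples of three) leave the downstream codons intact, which is why guides placed in a functional domain can still be useful: losing or gaining a few residues there may disrupt the protein without any frameshift.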

Gene knockout can be useful for identifying genes whose loss results in resistance or sensitization to the screening conditions (e.g. drug resistance). However, in many diseases it is modulation of gene expression and the associated phenotypic changes that drive pathology, and so other modes of perturbing the genome may be beneficial. For example, it may be preferable to downregulate rather than completely knock out gene expression, such as when investigating non-coding regions or essential genes, or where different phenotypes are observed at different expression levels. CRISPR interference (CRISPRi) libraries make use of a catalytically inactive version of Cas9 (dCas9) tethered to a transcriptional repressor, which interferes with transcription and results in gene knockdown. By contrast, dCas9 fused to a transcriptional activator such as VP64, or to an activator-recruiting system such as SunTag, can be used to increase gene expression, allowing an alternative screening approach known as CRISPR activation (CRISPRa). Selecting a CRISPRa library means that a genome-wide, gain-of-function screen can be performed without the need to clone and overexpress the cDNA of interest.

How many guides per gene?

A library should contain multiple guides per gene to ensure that every target is modified and that the resulting hits are statistically robust; libraries typically contain between 3 and 10 guides per gene. A balance must be struck so that the number of sgRNA in the library does not outstrip the technical feasibility and budget of the screen and assay. More sgRNA per gene (around 8–10) give greater statistical confidence, which is beneficial for identifying genes with a weaker phenotype in a primary screen. Fewer guides per gene (around 3–4) reduce statistical power but result in a smaller library, allowing the screen to be performed with fewer cells and so reducing the time and cost of cell culture. A smaller library is also useful when several models are screened and multiplexed on a single NGS run, which reduces sequencing costs, or when the number of cells available is limited, as with primary cells.
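
The trade-off between guides per gene and cell numbers is simple arithmetic, sketched below. The default coverage (cells maintained per sgRNA) and transduced fraction are illustrative assumptions rather than recommendations from this article, as are the function name and the gene count in the example; substitute the values appropriate to your own screen.

```python
import math


def screen_scale(n_genes: int, guides_per_gene: int,
                 coverage: int = 500, transduced_fraction: float = 0.3,
                 n_controls: int = 0) -> dict:
    """Estimate library size and cell numbers for a pooled screen.

    coverage            -- cells maintained per sgRNA (assumed value)
    transduced_fraction -- fraction of cells receiving a guide (assumed value)
    n_controls          -- extra control sgRNA in the library
    """
    if not 0 < transduced_fraction <= 1:
        raise ValueError("transduced_fraction must be in (0, 1]")
    library_size = n_genes * guides_per_gene + n_controls
    cells_to_maintain = library_size * coverage
    cells_to_transduce = math.ceil(cells_to_maintain / transduced_fraction)
    return {
        "library_size": library_size,
        "cells_to_maintain": cells_to_maintain,
        "cells_to_transduce": cells_to_transduce,
    }


if __name__ == "__main__":
    # Hypothetical genome-wide library at two different guide densities.
    for guides in (4, 10):
        print(guides, screen_scale(n_genes=19000, guides_per_gene=guides))
```

Because cell numbers scale linearly with library size, going from 4 to 10 guides per gene multiplies the cells to be cultured by 2.5 at the same coverage.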

Amplifying and using your selected library

  • Some CRISPR libraries will require amplification prior to use but it is vital that library representation is maintained after amplification
  • Next-generation sequencing is used to verify library completeness and that all sgRNA are present in your library after amplification
  • Some libraries are available as ready-to-screen lentivirus
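
The representation check described in the points above can be sketched as follows, assuming read counts per sgRNA have already been extracted from the sequencing data. The function name, the input format, and the use of a 90th to 10th percentile ratio as a measure of skew are assumptions made for this illustration, not a prescribed quality-control standard.

```python
import math
from typing import Dict, Mapping


def representation_summary(counts: Mapping[str, int]) -> Dict[str, float]:
    """Summarize sgRNA representation from per-guide read counts.

    Returns the fraction of guides with at least one read and the ratio
    of the 90th to the 10th percentile count (nearest-rank method).
    """
    values = sorted(counts.values())
    n = len(values)
    if n == 0:
        raise ValueError("counts must not be empty")
    detected = sum(1 for v in values if v > 0)
    # Nearest-rank percentiles on the sorted counts.
    p10 = values[max(0, math.ceil(n * 10 / 100) - 1)]
    p90 = values[max(0, math.ceil(n * 90 / 100) - 1)]
    skew = p90 / p10 if p10 > 0 else float("inf")
    return {"fraction_detected": detected / n, "skew_ratio_90_10": skew}


if __name__ == "__main__":
    # Hypothetical counts for a ten-guide library.
    demo = {"g1": 50, "g2": 100, "g3": 100, "g4": 100, "g5": 100,
            "g6": 100, "g7": 100, "g8": 100, "g9": 200, "g10": 400}
    print(representation_summary(demo))
```

A detected fraction below 1.0 means some sgRNA were lost, and a skew ratio that rises after amplification suggests the library has become less evenly represented.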

While early CRISPR screens and associated libraries were genome-wide, such as the GeCKO library,3,4 there are now various libraries available that target subsets of genes, focussing on gene families or biological pathways, for example kinases, cell cycle pathway proteins, or ribosomal proteins. By reducing the number of targets, the number of guides per gene can be increased to improve statistical confidence in hits, while also reducing sequencing costs and the computational burden of analysis. Decreasing the size of the library can also potentially reduce the noise in the system, allowing hits that would otherwise be masked to be identified.

Library characterization and format

Libraries are designed to contain sgRNA that are specific to a particular species, so the model system used in the screen should match the species for which the library was designed. However, commercially available libraries are designed for general use and so make assumptions about the targets included in the library. Selection of targets is based upon genomic data, which is constantly updated due to the dynamic nature of genome annotation. Screens using a general-use library may therefore differ in their efficacy between model systems, even those from the same species, so consideration must be given to the technical optimization of the model system that will be used in the screen. One approach is to perform a genome-wide screen, and then follow up any potential hits on a smaller scale using a custom library.

Libraries are delivered to cells as either a one-plasmid or a two-plasmid system, depending on the method of Cas9 delivery. With a two-vector system, Cas9 is delivered ahead of the sgRNA library. If Cas9 is delivered under the control of an inducible promoter, cells that express high levels of Cas9 can be selected so that gene knockout occurs more quickly once the sgRNA library is delivered. Originally, Cas9 and the sgRNA were delivered on separate plasmids to ensure a high viral titer, because the large size of a combined construct reduces titer. Single-vector systems are now available, however, and these reduce the number of delivery steps required, which can save time and money.

Screening made easy

The CRISPR-Cas9 gene-editing system provides an efficient and cost-effective way to interrogate the genome and discover the relationship between genotype and phenotype. Selecting a library should be based on the nature of the screen being performed, as well as the budget available for sequencing and analysis. A variety of pre-designed and pre-validated libraries are now available, which can allow scientists to shortcut the time-consuming and often complex task of library design while taking advantage of the latest advances in sgRNA design algorithms.

References

1. Cong, L. et al. Multiplex Genome Engineering Using CRISPR/Cas Systems. Science 339, 819–823 (2013)

2. Shi, J. et al. Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nature Biotechnology 33, 661–667 (2015)

3. Shalem, O. et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science 343, 84–87 (2014)

4. Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic Screens in Human Cells Using the CRISPR-Cas9 System. Science 343, 80–84 (2014)