Genomics in the post-genome era

In April 2003, coinciding with the 50th anniversary of the discovery of the structure of DNA, the Human Genome Project was completed. Launched in 1990, the project sequenced the entirety of the human genome with 99.9% accuracy, and shortly afterwards, the authors published the first human reference genome, providing “an essential foundation for the sequencing and analysis of additional large genomes”. This ushered in the post-genome era, in which the widespread availability of entire genome sequences for human and many other reference organisms made it possible to address questions about DNA function at the level of genes, RNA, and protein, bringing about a paradigm shift in the approach to genetics, healthcare, and drug discovery.

Enter functional genomics

Functional genomics makes use of the wealth of information generated from genome sequences, aiming to determine the relationship between genotype and phenotype, but crucially, addressing this on a genomic scale. The basic aim of a genomic screen is to modify gene function and observe the effect on phenotype. The phenotypes screened can be simple, such as cell viability or proliferation, or more complex, such as the expression of cell surface proteins. A classic genetic screen is said to be in the forward direction: the expression of multiple genes is altered, clones displaying the phenotype of interest are selected, and the responsible genetic change is identified. This is a phenotype-to-genotype approach, in contrast to a hypothesis-led reverse screen, usually performed ‘gene by gene’. In a reverse screen, specific genetic changes are made to genes of interest to generate a highly redundant mutant population, followed by assessment of the resultant phenotype.

In the post-genome era, we now have the genetic information required to perform genome-wide forward screens by systematic loss-of-function studies. Applying a selection pressure following gene knockout, for example treatment with a drug or a change in growth conditions, can then identify those genes that confer sensitivity or resistance. Functional genomic screening can thereby highlight the contribution of individual genes to the relevant phenotype, helping to unravel the intricacies of cellular pathways and disease states and to identify drug targets.

Going forward for the next-generation

Historically, forward genetic screens were difficult to perform on a large scale. Gene knockout by random mutagenesis was largely inefficient and did not lend itself to a genome-wide, systematic loss-of-function study, and the identification of the causal mutation by Sanger sequencing came with significant cost. Some phenotypes required a true null allele, and efficient knockout of both copies of the gene in diploid genomes was also problematic. The use of viruses and transposons with defined insertion sequences aided efficiency, but these methods were difficult to scale for high-throughput analysis of a genome.

From its introduction in 1977, Sanger sequencing dominated the field for 30 years, but the time and cost implications made it impossible to identify the mutations arising from a large-scale genetic screen. The rise of next-generation sequencing (NGS), with its miniaturized sequencing reactions and improved detection systems, meant that massively parallel sequencing analysis could now take place at high throughput and at much reduced cost.

The discovery of RNA interference (RNAi), an endogenous cellular pathway conserved across eukaryotes, provided the ability to perform targeted disruption of gene expression for use in genome-wide genetic screens. RNAi functions to degrade mRNA molecules and thereby suppress gene expression, playing an important role in cellular development and defense against viruses. Following work first published in 1998 in C. elegans, the RNAi pathway was quickly harnessed for gene silencing of targeted sequences. Knockdown occurs following the introduction of small interfering RNA (siRNA) or short hairpin RNA (shRNA) molecules into cells, which then target mRNA molecules with complementary sequences for degradation, thereby silencing the gene.

The CRISPR revolution

More recently, a new gene-disrupting technology has emerged, repurposed from a bacterial adaptive immune system. The CRISPR-Cas9 system utilizes the Cas9 endonuclease, which is guided to the target sequence by a single guide RNA (sgRNA) molecule, where it introduces a double-strand break at the desired locus. Error-prone repair of the break by cellular DNA repair pathways typically introduces small insertions or deletions that can disrupt the reading frame, resulting in gene knockout. Gene editing by CRISPR-Cas9 requires only the synthesis of the sgRNA, containing a 20-nucleotide portion complementary to the target site, and delivery of the components by standard molecular biology techniques. Combining the programmable nature and ease of design of RNAi with the permanent mutagenic capacity of an endonuclease, the CRISPR-Cas9 system displays robust rates of gene knockout with reduced off-target activity compared to RNAi.
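To make the "20-nucleotide portion" concrete, here is a minimal Python sketch of how candidate guide sequences might be enumerated in a stretch of DNA. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3' of the target site; the PAM requirement is not discussed above and is an assumption of this example. The function names are hypothetical, and real design tools also score on-target efficiency and off-target risk, which this sketch ignores.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_candidate_guides(target: str, guide_len: int = 20):
    """List (strand, start, protospacer) for every guide_len-nt site
    that is immediately followed by an NGG PAM (SpCas9 assumption).

    'start' is 0-based and, for the '-' strand, is measured along the
    reverse-complemented sequence rather than the input sequence.
    """
    target = target.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})[ACGT]GG)")
    hits = []
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        # The lookahead lets overlapping candidate sites all be reported.
        for match in pattern.finditer(seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits

if __name__ == "__main__":
    # Toy sequence invented for illustration, not a real gene.
    example = "GATTACAGATTACAGATTACAGGTT"
    for strand, start, protospacer in find_candidate_guides(example):
        print(strand, start, protospacer)
```

Each reported protospacer would form the target-specific part of an sgRNA; the rest of the sgRNA scaffold is constant and is omitted here.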

Three key takeaways

  • Functional genomics aims to determine the relationship between genotype and phenotype on a genomic scale, which was previously difficult
  • The CRISPR-Cas9 gene editing technology provides the means to perform genome-wide loss-of-function studies
  • Genome-wide sgRNA libraries can be delivered to cells by lentiviral vector

Since it was first shown to induce precise cleavage at specific loci in the mammalian genome in 2013, the CRISPR-Cas9 system has exploded into the biological space, and successful gene editing has subsequently been demonstrated in a wide variety of organisms, cell lines, and animal models. Large libraries of sgRNAs can be manufactured by array-based oligonucleotide synthesis; given the inherently scalable design of the system, it was inevitable that work would move on to assessing its use as a functional genomic screening tool. Within a year of the first demonstration of successful gene editing in mammalian cells, two papers (Shalem et al. and Wang et al.) were published simultaneously showing the successful application of CRISPR-Cas9 as a genome-wide screening tool.

Using essential genes as proof of concept, the two studies demonstrated that CRISPR-Cas9 could be used to elicit genome-wide, specific gene knockout, and identified genes whose loss conferred resistance to the anticancer drugs 6-thioguanine and vemurafenib. Employing methods developed for functional genomic screens using RNAi, lentiviral vectors delivered a genome-wide sgRNA library. Stable integration of the transgene by the lentiviral vector effectively provides each cell with a barcode, so the screen can be performed in a pooled format. Following the screen, the sgRNAs that were enriched or depleted can be easily identified from the pool by NGS, providing a means to perform both positive and negative screening.
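As a rough illustration of how enriched and depleted sgRNAs might be called from NGS read counts, the Python sketch below normalises each sample to reads per million and computes a log2 fold change per guide. This is a simplified assumption of the analysis, not the method used in the two studies; the function name, the pseudocount, and the guide names and counts in the usage example are all invented for illustration. Published pipelines add replicate handling, statistical testing, and gene-level aggregation across multiple guides.

```python
import math

def log2_fold_changes(before: dict, after: dict, pseudocount: float = 1.0) -> dict:
    """Per-guide log2 fold change between two pooled samples.

    'before' and 'after' map sgRNA identifiers to raw read counts.
    Counts are scaled to reads per million so that samples sequenced
    to different depths are comparable; the pseudocount avoids
    division by zero for guides that drop out completely.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    result = {}
    for guide, count_before in before.items():
        count_after = after.get(guide, 0)
        rpm_before = count_before / total_before * 1e6
        rpm_after = count_after / total_after * 1e6
        result[guide] = math.log2((rpm_after + pseudocount) / (rpm_before + pseudocount))
    return result

if __name__ == "__main__":
    # Made-up toy counts: not data from any real screen.
    pre_selection = {"sgA_1": 500, "sgB_1": 500, "sgC_1": 500}
    post_selection = {"sgA_1": 1500, "sgB_1": 450, "sgC_1": 50}
    for guide, lfc in sorted(log2_fold_changes(pre_selection, post_selection).items()):
        label = "enriched" if lfc > 0 else "depleted"
        print(f"{guide}\t{lfc:+.2f}\t{label}")
```

In a positive (resistance) screen the guides of interest are those with strongly positive values after selection; in a negative (dropout) screen, such as one for essential genes, they are the strongly negative ones.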

The future is here

Genetic screening seeks to understand the function of a gene by linking genetic sequence with biological phenotype. With CRISPR-Cas9 and NGS combined, scientists now have unmatched screening power, both in terms of the adaptability and scale of the screens possible, and the speed and depth with which hits can be identified. Technological developments continue, with screens performed in primary cells and animal models. The gene-editing repertoire of CRISPR-Cas9 has been further increased with the use of a catalytically inactive form of Cas9 (dCas9) fused to effector domains, allowing changes other than loss-of-function to be made and used in a screen. Functional genomics arose in the post-genome era to use and understand the massive amount of sequencing data in a biological context, and functional genomic screening is now providing important insights into biological systems and mechanisms of disease, as well as furthering drug discovery.