Proteogenomics (PG), a form of multi-omics involving the simultaneous characterization of a sample’s protein and genetic constituents, has been theoretically possible for as long as high-throughput proteomics and genomics have existed together. Flow cytometry efficiently separates and quantifies cells according to their protein content, and high-throughput sequencing has been around for two decades. Combining these datasets has always been possible, but all that achieves is a comparison of two average values across thousands or millions of cells. Genomics tells us which cells express a gene but not whether that gene has been translated into protein, while proteomics reports only on the presence or absence of a protein. Neither method alone reveals the relationship between the two, but PG, particularly of the single-cell variety, does.

Sample size and availability, perennial issues for omics studies, become limiting conditions for multi-omics. The Clinical Proteomic Tumor Analysis Consortium advises beginning with a 100-milligram sample for tumor proteomics, but needle biopsies typically yield 20 milligrams, and PG analysis is only one of several uses competing for that material.

A group at Baylor College of Medicine led by Matthew Ellis, M.D., recently reported on a sample-preparation method that reduces material requirements for PG on tumor biopsies, from quantities typically available only from resection to biopsy-sized samples.

“You can analyze DNA, RNA, and proteins separately, and each will give their own part of the story,” Ellis says. “While sequencing can detect somatic mutations, we don’t always understand their downstream effects, which are carried out by proteins.”

Multi-omic analysis of tiny needle specimens opens the possibility of applying PG more broadly to both basic research and precision medicine, Dr. Ellis’s main focus.

The case for single-cell PG

Gross analysis of a tumor sample (grinding up all the cells together and applying genomic or proteomic analysis) provides only an average value for the protein or gene of interest in that overall sample. By characterizing each cell for both genomic and proteomic content, single-cell PG (scPG) groups cells according to the specific genomic and proteomic markers each one holds at the time of analysis. It can, therefore, uncover novel relationships between genes and proteins, including gene silencing, over-activation, tumor escape mechanisms, and other relationships critical to understanding and ultimately treating cancer.
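The difference between an averaged readout and a per-cell grouping can be sketched in a few lines of Python. This is a toy illustration only: the ten cells and their detected/not-detected flags are invented, and the function names `bulk_profile` and `single_cell_profile` are hypothetical, not part of any tool mentioned in this article.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy sample: one (rna_detected, protein_detected) pair per cell
# for a single marker. The values are invented for illustration.
cells = [(1, 1)] * 5 + [(1, 0)] * 3 + [(0, 0)] * 2


def bulk_profile(cells):
    """Mimic gross analysis: one average per measurement for the whole sample."""
    return {
        "rna": mean(rna for rna, _ in cells),
        "protein": mean(protein for _, protein in cells),
    }


def single_cell_profile(cells):
    """Mimic scPG: count the cells in each joint RNA/protein category."""
    return Counter(cells)


bulk = bulk_profile(cells)
groups = single_cell_profile(cells)
print(bulk)    # two averages, with no link between them
print(groups)  # how many cells fall into each RNA/protein combination
```

The averaged profile shows only that the transcript is more common than the protein. The per-cell grouping additionally shows that a distinct subpopulation carries the RNA without the protein, which is the kind of relationship the article describes scPG as uncovering.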

Adoption of PG in routine cancer diagnosis and prognosis will require the emergence of robust, reproducible single-cell methods, according to Andreas Schmidt, Ph.D., CEO of Proteona. “The reasons seventy percent of cancer drugs eventually fail are varied, but tumor heterogeneity is a root cause. Cancer is not a disease where one type of cell grows quickly, and once you address that you’re fine. Multiple dysregulated subclones exist, any one of which might be responsible for relapses.”

Proteona’s specialty, multiple myeloma, is a blood cancer whose very name suggests a disease originating from several distinct cancer lineages. “You need a treatment that kills all problematic cells. If you only kill ninety percent, those that survive have a competitive advantage, and the tumor comes back stronger,” Schmidt tells Biocompare.

Nature Methods named single-cell genetic sequencing “Method of the Year” in 2013. Since then single-cell genomic methods have improved, in large part thanks to amplification methods like PCR. In 2018, with single-cell RNA-seq under investigation by hundreds of academic and commercial research groups worldwide, Science honored the method with its own award, “Breakthrough of the Year.”

It took some time for proteomics, for which no analogous amplification technology exists, to catch up with genomics in detection limits and overall robustness, but catch up it did. By 2019, single-cell proteomics had been refined to the point where reliable multi-omics results could be obtained from a single cell. That method, single-cell proteogenomics, was similarly designated “Method of the Year” by Nature Methods.

Whether it is based on single-cell or gross tissue analysis, proteogenomics offers a more comprehensive view of cellular activity than either of its constituent methods, a classic “whole is more than the sum of its parts” proposition. Because PG data offers unique insights into a cell’s eventual fate (e.g., cancer or not) and its response to drugs, and does so for each cell in the sample, researchers expect that scPG could serve as a platform for personalized diagnostics and prognostics.

ESCAPE from single omics

“In our model we take a liquid or solid tumor sample of, say, ten thousand cells, and characterize each cell for relevant proteomics and everything we know about its messenger RNA,” Schmidt says.

Proteona’s ESCAPE (Enhanced Single Cell Analysis with Protein Expression) platform interrogates both proteomic and genomic information at their nexus and in real time. The technique uses DNA-barcoded antibodies to quantify up to 70 protein markers while simultaneously measuring all mRNAs in the same cell. Artificial intelligence-based algorithms stratify cells according to both genomic and proteomic content, uncovering many unique cell populations contributing to tumor progression or relapse. ESCAPE then queries databases to determine which drugs have worked against those specific cell types.

“By combining the mRNA from the same cell fractions it is possible to characterize things for which you have no antibodies. You can even find tumor escape mechanisms, for example surface marker suppression, through the mRNA,” Schmidt explains. A typical experiment targets 4,000 messenger RNAs and 70 relevant proteins.

ESCAPE selects therapies based not on patients or gross diagnosis (e.g. breast cancer) but on the proteogenomic contents of each cell harvested from their tumor. “It asks if we’ve seen this cell before, and if a known drug targets it,” Schmidt explains, “and then asks the same questions to the next cell, and the next.” A proprietary algorithm classifies every unique gene/protein combination and suggests a single drug or combination known to eliminate as many problematic cell types as possible.
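Proteona’s algorithm is proprietary and is not described here, but the general idea of choosing a small set of drugs that covers as many problematic cell types as possible resembles the classic set-cover problem. The sketch below uses a simple greedy heuristic as a stand-in. The clone labels, drug names, drug-to-clone mapping, and the `max_drugs` limit are all hypothetical.

```python
def greedy_drug_selection(problem_cell_types, drug_targets, max_drugs=3):
    """Greedy set-cover heuristic (an illustrative stand-in, not ESCAPE itself).

    problem_cell_types: set of cell-type labels found in the sample.
    drug_targets: dict mapping a drug name to the set of cell types it is
        known to eliminate (hypothetical lookup table).
    Returns the chosen drugs and any cell types left uncovered.
    """
    remaining = set(problem_cell_types)
    chosen = []
    while remaining and len(chosen) < max_drugs:
        # Pick the drug covering the most still-uncovered cell types;
        # sorting the names makes ties resolve deterministically.
        best = max(sorted(drug_targets),
                   key=lambda drug: len(drug_targets[drug] & remaining))
        gained = drug_targets[best] & remaining
        if not gained:
            break  # no available drug helps with what is left
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


# Invented example data.
problem = {"clone_1", "clone_2", "clone_3", "clone_4"}
targets = {
    "drug_A": {"clone_1", "clone_2"},
    "drug_B": {"clone_2", "clone_3"},
    "drug_C": {"clone_3", "clone_4"},
}
chosen, uncovered = greedy_drug_selection(problem, targets)
print(chosen, uncovered)
```

A greedy choice is not guaranteed to find the smallest possible combination, and a real system would also have to weigh toxicity, drug interactions, and clinical evidence, none of which this sketch models.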

Multiple myeloma is an excellent proving ground for ESCAPE: many cell types are involved and numerous treatments are available, yet long-term remissions are uncommon. Caregivers currently prescribe bone marrow transplants, one of approximately 10 targeted therapies, or one of about 20 drug combinations in a fashion that Dr. Schmidt calls “more art than science.”

A major issue with omics generally, and especially with multi-omics platforms, is what to do with the vast quantities of data they generate. According to Schmidt, annotating the results of one single-cell study “by hand” can take a Ph.D. scientist the better part of a day. Proteona, whose corporate DNA includes equal contributions from biology and information technology, uses artificial intelligence to make sense of genomic and proteomic outputs.

Whither PG and precision medicine?

Whether PG eventually leads to precision medicine that any health system can afford is a question legitimately asked of every technology promising “the right drug for the right patient at the right time.” At this point scPG is too expensive for routine diagnostics and probably will be for some time.

Schmidt has noted that sequencing costs have already come down substantially and will continue to fall. Additionally, when developers apply laboratory analytics to diagnostics (as opposed to discovery or “basic science” mode), the molecules of interest are reduced to a manageable number, which is true for both proteomics and genomics. The first step toward adopting scPG as a diagnostic will therefore be validating the method in high-impact diseases: life-threatening illnesses that are expensive to treat, and for which a PG approach makes more sense than either proteomics or genomics alone.

For now that means cancer, specifically tumors for which proteomics or genomics alone do not improve on the extant standard of care or might trigger prescribing a very expensive drug that does more harm than good.

Proteona CSO Dr. Jonathan Scolnick explains that “we regularly see gene and protein expression combinations that show the need for both measurements. In some instances tumor cells do not express detectable levels of a gene that encodes a drug target, for example a multiple myeloma patient whose plasma cells express CD38 protein (whose levels trigger prescribing daratumumab), while CD38 [RNA] levels are undetectable. In other cases, we see escape mechanisms such as undetectable BCMA surface protein when the BCMA RNA is clearly expressed. This occurs when BCMA protein is cleaved from the cell surface by gamma secretase, which suggests treating with a gamma secretase inhibitor. These drugs have side effects, so you want to prescribe them only to individuals who will benefit.”
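The two discordant patterns Scolnick describes, protein without detectable RNA and RNA without detectable surface protein, can be flagged with a simple per-marker comparison. The following is a minimal sketch under stated assumptions: the thresholds, the counts, and the `classify` helper are invented for illustration and do not reflect how ESCAPE or any clinical assay calls detection.

```python
# Hypothetical detection thresholds; a real assay would calibrate these
# per marker rather than use fixed values.
RNA_THRESHOLD = 1      # transcript counts at or above this are "detected"
PROTEIN_THRESHOLD = 5  # antibody-barcode counts at or above this are "detected"


def classify(rna_count, protein_count):
    """Label one marker in one cell by comparing its RNA and protein calls."""
    rna = rna_count >= RNA_THRESHOLD
    protein = protein_count >= PROTEIN_THRESHOLD
    if rna and protein:
        return "both detected"
    if protein:
        return "protein without RNA"
    if rna:
        return "RNA without protein"
    return "neither detected"


# Invented counts for a single cell: (marker, rna_count, protein_count)
cell = [("CD38", 0, 42), ("BCMA", 17, 0)]
report = {marker: classify(rna, protein) for marker, rna, protein in cell}
print(report)
```

Either measurement alone would mislead for this toy cell: RNA alone would miss the CD38 protein, and protein alone would miss the BCMA transcript that hints at an escape mechanism.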

Mass Spectrometry: Integrating Genomics and Proteomics

Proteogenomics (PG) as the study of big biology on very small samples would not be possible today but for improvements in mass spectrometry-based proteomics, particularly in sensitivity and specificity. “MS provides information beyond the protein sequence,” says Emily Chen, Ph.D., senior director of the Thermo Fisher Precision Medicine Science Center. The identification of native peptides, post-translational modifications (PTMs), and proteoforms through MS bridges the gap between genomics and proteomics. “This deeper dive allows us to examine post-translational regulations such as receptor binding, signaling, and protein degradation,” adds Chen.

Traditional biochemical methods use a biased approach to identify proteins, requiring protein-specific reagents, e.g., antibodies, for each molecule of interest. By relying on mass measurements instead of affinity, MS sequences even low-abundance proteins and their fragments with high accuracy, readily distinguishing between native and post-translationally modified peptides.

Thanks to high-resolution MS technologies, sample requirements for the proteomic side of PG have shrunk to a small fraction of what was previously needed. This opens up possibilities for integrated molecular-based diagnostics, prognostics, and drug screening. “With samples from human patients, there are no do-overs. So, the equipment and methods used must be sensitive and precise to generate highly accurate data,” says Chen. “Advanced bioinformatics pipelines have also been developed to integrate and visualize PG data,” Chen tells Biocompare. These modern techniques provide molecular profiles of patients in a healthy or diseased state, bringing us a step closer to precision medicine.