Protein phosphorylation is a significant post-translational modification (PTM) implicated in numerous cell regulatory processes, for example in the activation/deactivation of enzymes and receptors. Phosphorylation and dephosphorylation are catalyzed by kinase and phosphatase enzymes, respectively.

“Where proteomics is the large-scale approach for directly studying protein abundance and co-interaction, phosphoproteomics focuses specifically on characterizing the proteins and specific sites involved in phosphorylation-based signaling,” explains Aaron Bailey, Ph.D., product manager for mass spectrometry services at BGI Americas.

Because phosphoproteins are less abundant than nonphosphorylated proteins, the tool set for analyzing the phosphoproteome is correspondingly narrow, limited mostly to methods involving mass spectrometry (MS). Even those methods must be optimized for sensitivity and selectivity, because the phosphorylation stoichiometry that is diagnostic of many interesting biological states or medical conditions is often low.

Approximately half of all animal proteins are phosphorylated, some at multiple sites. For the human proteome, that’s roughly 230,000 potential phosphorylation events to categorize, track, and quantify. Given its low occurrence, often low stoichiometry, and reversibility, phosphorylation is a formidable target for analytical biochemists, particularly given its role in disease and health.

“Any analysis of cellular signaling that does not probe phosphorylation is missing a big piece of the puzzle to explain phenotypes observed,” says Joshua Nathan, senior product marketing manager at Cell Signaling Technology.

For example, dysregulation of kinase activity or expression is common in cancer, and phosphorylation is the target of many approved drugs and hundreds of medicines under development, as well as a focus of early drug discovery and diagnostics. Because phosphorylation status lies outside the reach of genomics or transcriptomics, phosphorylation data complement the data those methods provide.

Because raising antibodies specific to the phosphorylated or nonphosphorylated form of a protein for ELISA-based assays is time consuming, investigators usually turn to phosphoproteomics: the application of mass spectrometry to quantify phosphorylated peptides from a tryptic digest of tissue or cells.

Gary Kruppa, vice president of proteomics at Bruker Scientific, notes the many challenges of phosphoproteomics, the foremost one being sample preparation.

“To ensure preservation of a sample’s phosphorylation status, the sample is typically treated with a cocktail of phosphatase and kinase inhibitors during cell lysis or blood draw, then subjected to trypsin digestion to yield peptides that are amenable to standard shotgun proteomics methods.”

Enrichment issues

As with proteomics generally, analyzing the phosphoproteome is complicated by the number and wide concentration dynamic range of proteins in cells or plasma, and further confounded by substoichiometric levels of phosphorylation. Investigators therefore enrich target phosphopeptides by affinity methods such as immobilized metal affinity chromatography (IMAC), titanium dioxide, or phosphorylation-specific antibodies.

According to Bailey, challenges for generating “rich” phosphoproteomic data include enrichment of phosphopeptides across a wide concentration dynamic range, accurate quantitation of each of these modification events, and confident localization of the exact sites of phosphate additions, particularly for amino acid sequences containing multiple serine, threonine, and/or tyrosine residues.

On an even more basic level, target enrichment demands significant quantities of starting protein, which is tough to come by when sample is limited, Kruppa says, “and even after enrichment the digests are still quite complex. Phosphopeptide analysis therefore requires long chromatographic runs using nanoflow LC coupled to high speed, high-sensitivity MS.”

Moreover, the very property that enrichment exploits, the affinity of phosphopeptides for metals, complicates downstream analysis, because samples are exposed to metal surfaces during the run and phosphopeptides can be lost to them.

“Amino acid phosphorylation dramatically alters the efficiency of trypsin cleavage at nearby sites, resulting in uncleaved linkages and larger peptides entering the analysis stream. Fragmenting larger peptides can be challenging, but efficient fragmentation is essential for identifying the peptide and phosphorylation site,” Kruppa says. “Larger peptides commonly consist of multiple sites for phosphorylation and other PTMs. The need for enrichment, combined with these factors, makes determination of phosphorylation stoichiometry particularly difficult.”
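The effect Kruppa describes, a missed cleavage producing a larger peptide with more candidate sites, can be sketched with a simplified in silico digest. This is a minimal illustration, not a model of real trypsin behavior: it assumes the common textbook rule (cleave after K or R unless the next residue is P), and the sequence and function names are hypothetical.

```python
def cleavage_fragments(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    This is the common simplified trypsin rule; real cleavage efficiency
    also depends on nearby residues and modifications.
    """
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides that span up to `missed_cleavages` uncleaved sites."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def count_candidate_sites(peptide):
    """Count serine, threonine, and tyrosine residues in a peptide."""
    return sum(peptide.count(aa) for aa in "STY")


# Hypothetical sequence, chosen only to illustrate the rule.
protein = "AGSPEKLTRPSVKDYE"
print(tryptic_digest(protein))
# ['AGSPEK', 'LTRPSVK', 'DYE']
print(tryptic_digest(protein, missed_cleavages=1))
# ['AGSPEK', 'AGSPEKLTRPSVK', 'LTRPSVK', 'LTRPSVKDYE', 'DYE']
```

In this toy case the fully cleaved peptide AGSPEK carries one candidate site, while the missed-cleavage product AGSPEKLTRPSVK carries three, which is the kind of ambiguity that makes site localization harder.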

Enrichment strategies have their advantages and drawbacks. “IMAC yields large datasets but can be unfocused with respect to sites identified,” says Nathan, “which means you might miss critical, lower-abundance regulatory sites, particularly phosphotyrosine. More specific enrichments using antibodies directed to phosphotyrosine or canonical serine or threonine kinase substrate motifs can be used to more specifically enrich peptides containing sites of interest, which allows a more limited subset of phosphopeptides presented to the mass spectrometer and a better chance of capturing low abundance/low stoichiometry sites.”

Site-specific antibodies can also be used as enrichment tools either singly or in combination to create fixed lists of sites to profile across samples. “These enrichments lend themselves to targeted assays,” Nathan adds, “where absolute quantification of a given peptide or group of peptides can be performed across an unlimited number of samples.”

Overcoming the hurdles

LC-MS companies are doing their part to overcome these challenges through strategies that minimize phosphopeptide sample losses to metal surfaces, and by constantly improving the capabilities of MS in shotgun proteomics.

“By continuing to increase analysis speed and sensitivity, it is now possible to identify more phosphopeptides than ever in a typical ninety-minute LC/MS run,” Kruppa tells Biocompare. One such system is Bruker’s recently introduced shotgun proteomics platform, which pairs the timsTOF Pro MS system (taglined “the new standard for 4D shotgun proteomics”) with the company’s nanoElute LC. The MS component uses a trapped ion mobility spectrometry (TIMS) analyzer, which accumulates ions and separates them according to their collision cross sections (CCS), providing an additional separation mode. CCS also serves as an additional physical parameter for peptide identification, one that can distinguish between two identical sequences phosphorylated at different locations.

“These positional phosphorylation isomers have the same sequence and mass-to-charge and, if they co-elute, cannot be characterized without the trapped ion mobility separation,” Kruppa explains. “TIMS also provides greatly enhanced sensitivity, requiring smaller injected samples and hence significantly less sample.”
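Why positional isomers are indistinguishable by mass alone can be shown with a short calculation. The sketch below enumerates every way to place a given number of phosphate groups on the serine, threonine, and tyrosine residues of a peptide and computes the precursor m/z for each placement. It uses standard monoisotopic masses; the peptide and the function names are hypothetical, and nothing here reflects how any vendor's software works.

```python
from itertools import combinations

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565    # added once per peptide for the free termini
PROTON = 1.007276    # added once per charge
PHOSPHO = 79.966331  # HPO3, added per phosphorylated S, T, or Y


def positional_isomers(sequence, n_phospho):
    """Return every placement of n_phospho groups on S/T/Y (0-based indices)."""
    sites = [i for i, aa in enumerate(sequence) if aa in "STY"]
    return list(combinations(sites, n_phospho))


def annotate(sequence, positions):
    """Mark phosphorylated residues with '(ph)' for display."""
    return "".join(
        aa + "(ph)" if i in positions else aa for i, aa in enumerate(sequence)
    )


def precursor_mz(sequence, positions, charge):
    """Precursor m/z; depends on how many sites are modified, not which ones."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    mass += PHOSPHO * len(positions)
    return (mass + charge * PROTON) / charge


# Hypothetical peptide with two candidate sites (T and S).
peptide = "LTRPSVK"
for positions in positional_isomers(peptide, 1):
    print(annotate(peptide, positions), round(precursor_mz(peptide, positions, 2), 4))
```

Both isomers print the same m/z, so telling them apart requires either site-specific fragment ions or an orthogonal separation such as the ion mobility dimension Kruppa describes.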

The Bruker system also incorporates parallel accumulation serial fragmentation (PASEF) for greater analysis speed, with the benefit of targeting a larger number of phosphopeptides per run. According to Kruppa, libraries of experimental CCS values will soon be available to add confidence to phosphopeptide identification, and several groups are applying artificial intelligence and machine learning to known peptides to add predictive value for unknowns.

Perhaps the greatest hurdle to routine phosphoproteomics is inconsistency in tryptic digest efficiency, which introduces uncertainty in determining the stoichiometry of phosphorylation. As Kruppa points out, phosphopeptide enrichment cannot be used to determine stoichiometry. “For such experiments, labeling techniques such as SILAC have been proposed to compare phosphorylation levels in differently treated samples. However, a more direct approach for completely avoiding complications introduced by enrichment and digestion is to measure intact proteins or larger peptides produced by digestion with enzymes other than trypsin, known as top-down and middle-down approaches, respectively.”
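A toy calculation shows why enrichment alone cannot yield stoichiometry. Stoichiometry, or site occupancy, is the fraction of copies of a site that carry a phosphate, and a naive estimate takes the phosphopeptide signal as a share of the combined phosphorylated and unmodified signal. The sketch below assumes, unrealistically, that both forms give equal MS response; the recovery and carryover figures are invented for illustration, and the function names are hypothetical.

```python
def site_occupancy(phospho_intensity, unmodified_intensity):
    """Naive occupancy: share of total signal from the phosphorylated form.

    Assumes equal MS response for both forms, which generally does not hold.
    """
    if phospho_intensity < 0 or unmodified_intensity < 0:
        raise ValueError("intensities must be non-negative")
    total = phospho_intensity + unmodified_intensity
    if total == 0:
        raise ValueError("no signal for either form")
    return phospho_intensity / total


def enrich(phospho_intensity, unmodified_intensity,
           phospho_recovery, unmodified_carryover):
    """Scale each form by the fraction that survives enrichment."""
    return (phospho_intensity * phospho_recovery,
            unmodified_intensity * unmodified_carryover)


# Invented numbers: a site that is 5% phosphorylated before enrichment.
before = site_occupancy(5.0, 95.0)
after = site_occupancy(*enrich(5.0, 95.0,
                               phospho_recovery=0.8,
                               unmodified_carryover=0.01))
print(f"before enrichment: {before:.2f}, after enrichment: {after:.2f}")
```

Because enrichment deliberately removes the unmodified form, the apparent occupancy afterward reflects the enrichment chemistry rather than the biology.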

Top-down phosphoproteomics for complex samples suffers from limited dynamic range. “The middle-down method shows promise, but has not been widely explored,” says Kruppa. “So while either of these methods are unlikely to replace the more commonly used enrichment and tryptic digestion strategies for extensive identification of phosphorylation sites in the near future, they remain interesting.”