Genes, for the most part, encode proteins. Thanks to the universality of the genetic code, a human protein expressed in E. coli will have the correct polypeptide sequence. But purify that protein and try to make it work in a test tube, and there’s a good chance you’ll fail, thanks in part to post-translational modifications.
Most eukaryotic proteins are controlled by a series of chemical switches. Therefore, naked protein sequences (at least those that are incompletely or improperly outfitted) either won’t work at all or won’t work as well as those that are properly modified. Many such switches exist. Researchers are well aware of phosphorylation, for example. Another of the most prevalent chemical modifications is glycosylation, the coupling of proteins to complex sugar chains.
For many researchers, glycosylation can be a tough nut to crack. “People are used to sequence,” says Carlito Lebrilla, distinguished professor of chemistry at the University of California, Davis. Gene sequencing is trivial these days, and the tools for manipulating DNA and RNA are well established and ubiquitous. Given a nucleotide sequence, researchers can predict the sequences of the proteins that actually carry out that DNA’s instructions.
But sugars are a far more complicated problem, Lebrilla explains. There is no known template that specifies how sugars will assemble, for instance, and a given protein may contain multiple glycan forms in several locations, producing a heterogeneous set of proteins, each of which may function differently but all bearing the same base polypeptide sequence.
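To get a feel for how quickly that heterogeneity multiplies, here is a minimal Python sketch that counts the distinct glycoforms one polypeptide could carry. The site names and glycan labels are hypothetical, invented only for illustration; the sketch also assumes each site is occupied independently of the others, which real proteins may not obey.

```python
from itertools import product
from math import prod

# Hypothetical example: glycan forms seen at each site of one protein.
# "none" marks an unoccupied site. Site names and glycan labels are made up.
site_options = {
    "Asn-52": ["none", "Man5", "G0F", "G2FS2"],
    "Asn-110": ["none", "G0F", "G1F"],
    "Thr-201": ["none", "core-1"],
}

def count_glycoforms(options):
    """Number of distinct site-by-site combinations (glycoforms)."""
    return prod(len(forms) for forms in options.values())

def enumerate_glycoforms(options):
    """Yield each glycoform as a dict mapping site -> glycan form."""
    sites = list(options)
    for combo in product(*(options[site] for site in sites)):
        yield dict(zip(sites, combo))

total = count_glycoforms(site_options)
forms = list(enumerate_glycoforms(site_options))
print(total)  # 4 * 3 * 2 = 24
```

Even this small made-up case, with three sites and a handful of forms per site, yields 24 possible species that all share one polypeptide sequence.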
Glycan chains can be assembled from a variety of monomer subunits and form complex branching structures because of their multiple linking positions (as opposed to proteins, which always have a single kind of linkage). The result is a complex polymer assembly more akin to a bush than a chain, says Vern Reinhold, a structural glycobiologist at the University of New Hampshire, whose lab develops methods to decode these bushes, twig by twig.
Add in the unfamiliar language of sugar chemistry and the relative paucity of user-friendly tools, and it’s no wonder many non-specialists are intimidated when it comes to analyzing the carbohydrate components of their proteins, says Paula Magnelli, a glycobiology research scientist at New England Biolabs. “Although it requires expertise, a glycan sample can be prepared for analysis using only benchtop equipment and consumables,” Magnelli says. “But after that, you don’t have a lot of tools to elucidate the structure of the sugars, and you will likely need to send it to a third party.” Online and off-line educational resources are expanding, however, and so is the analytical toolbox.
Deciphering a given protein’s glycosylation pattern in detail is a massive undertaking requiring, among other things, some seriously advanced mass spectrometry. If that’s your goal, there’s simply no getting around it (more on that later). But non-MS experts can tease out quite a bit of information at the bench, too, sometimes by simply using gel electrophoresis. (See Sigma-Aldrich’s Glycobiology Analysis Manual for a fairly comprehensive overview of options.)
For instance, glycoproteins are always heavier than their unmodified counterparts and they are often heterogeneous, as well. So if you suspect your protein might be glycosylated, run it on an SDS-PAGE gel, says Lebrilla. If the band is either fuzzy or larger than anticipated, it could be glycosylated.
You can also use glycosidases to clip the glycans from the protein, then run protein gels to compare the weights before and after. PNGase F (available from Sigma-Aldrich and New England Biolabs, among others) is a common choice, used to remove N-linked glycans. (Sugar moieties can be either N-linked, connected via asparagine residues, or O-linked, connected through serine or threonine.) Removing O-linked glycans is trickier, as it usually requires a chemical reaction (though O-linked glycosidases, such as O-glycosidase, are available). But most proteins with O-linked sugars also contain N-linked sugars, says Lebrilla, so PNGase F should work in either case.
For a global view of glycosylation, glycoprotein-specific gel stains are available. These include the fluorescent Pro-Q® Emerald 300 Glycoprotein Stain Kit from Life Technologies and the colorimetric Pierce Glycoprotein Staining Kit from Thermo Fisher Scientific.
For more specific glycan questions, lectins act like glycan-specific antibodies and can be used on Western blots to determine whether a specific sugar is present on a given protein. Life Technologies’ Click-iT® enzymatic labeling system can label cells or proteins containing a specific sugar (such as O-GlcNAc or mannosamine). Alternatively, there are the 2-AA and 2-AB GlycoProfile™ labeling kits from Sigma-Aldrich, which tag sugars with fluorescent moieties for either direct detection or improved mass spectrometric ionization.
Finally, glycosyltransferases enable researchers to specifically tag proteins with given glycans. Sigma-Aldrich offers two new sialyltransferases, for instance, which can be used to couple sialic acid moieties to the ends of pre-existing glycans. Say, for instance, that you express a protein that fails to work as expected, says Robert Gates, market segment manager at Sigma Life Sciences. “They may need a sialic acid on the end, and this enzyme can add those.”
Ultimately, all that the tools described above will tell you is whether or not a given protein is glycosylated, plus a bit of information about how that glycan is coupled to the protein (O- or N-linked) and what sugars it may contain.
To really delve into glycan composition, you’ll need mass spectrometry. (A recent review in Analytical Chemistry, by Joseph Zaia at Boston University, is a good starting point.) Depending on the question, says Magnelli, who is developing workflows to simplify glycan sample preparation and carbohydrate analysis, researchers generally use either a MALDI-TOF to acquire a general overview of the sample’s glycan composition, or electrospray ionization with an ion trap for more detailed structural analyses.
Sometimes researchers need multiple instruments to complete their analyses. The Analytical Services laboratory, at the Complex Carbohydrate Research Center (CCRC) at the University of Georgia, has 14 mass spectrometers, says technical director for analytical services, Parastoo Azadi: three Thermo Fisher instruments, comprising an LTQ Orbitrap (ion trap-Orbitrap hybrid), an LTQ and an LCQ Advantage; three MALDI instruments (one from Bruker Daltonics plus a 4700 MALDI and a 5800 MALDI from AB SCIEX); seven GC-MSs (six from Agilent Technologies and one from Shimadzu); and an Extrel Pyrolysis-Molecular Beam MS.
Reinhold’s lab uses three mass spectrometers for its studies, but relies principally on a Thermo Fisher LTQ ion trap instrument, which is required to unravel a glycan’s structural details. Reinhold’s research focuses on the minutiae of glycan structure. First, he stabilizes the glycan by chemically blocking all unlinked positions (that is, every position not involved in a subunit linkage), a process called permethylation. Then he disassembles the methylated glycan stepwise in the ion trap, which exposes the former linkage positions, called “scars.” This “scarring” can be mapped to the starting structure, but more importantly, the product fragments can be disassembled into ever smaller pieces to expose their molecular details (a strategy called MSⁿ). “Unless you pursue these details, you could miss an opportunity for discovering a biomarker,” Reinhold says.
Azadi focuses on glycoproteins. Given an unknown molecule, Azadi says her typical strategy is first to determine its monosaccharide composition by hydrolysis and HPLC. “That tells me how much mannose, galactose, sialic acid and N-acetylglucosamine” are present, Azadi says. Then, she turns to mass spectrometry, using MALDI technology to measure intact glycan size and the high-resolution, high-mass-accuracy LTQ-Orbitrap MS/MS to dissect and sequence the oligosaccharides and more. If necessary, Azadi also can map linkages using permethylation modification and gas chromatography-mass spec.
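The link between a monosaccharide composition and the intact glycan size seen by MALDI is simple arithmetic, sketched below in Python. This is an illustration, not Azadi's workflow: it assumes a native (underivatized) glycan detected as a singly charged sodium adduct, uses standard monoisotopic residue masses, and the function names are invented for this example. A permethylated glycan would need different residue masses.

```python
# Monoisotopic residue masses (Da): the free sugar minus the one water
# lost when it joins a glycosidic bond.
RESIDUE_MASS = {
    "Hex": 162.05282,     # mannose, galactose, glucose
    "HexNAc": 203.07937,  # N-acetylglucosamine, N-acetylgalactosamine
    "dHex": 146.05791,    # fucose
    "NeuAc": 291.09542,   # N-acetylneuraminic (sialic) acid
}
WATER = 18.01056
SODIUM_ION = 22.98922  # Na+ (sodium atom minus one electron)

def glycan_mass(composition):
    """Monoisotopic mass of a free, underivatized glycan.

    composition maps residue type -> count, e.g. {"Hex": 5, "HexNAc": 2}.
    One water is added back for the reducing end.
    """
    residues = sum(RESIDUE_MASS[kind] * n for kind, n in composition.items())
    return residues + WATER

def sodiated_mz(composition):
    """m/z of the singly charged [M+Na]+ ion (assumed adduct)."""
    return glycan_mass(composition) + SODIUM_ION

# High-mannose glycan with five mannoses on the two-GlcNAc core.
man5 = {"Hex": 5, "HexNAc": 2}
print(round(sodiated_mz(man5), 2))  # 1257.42
```

Because all hexoses share one mass, a measured mass constrains the composition but not the identity or linkage of each sugar, which is why the composition and linkage analyses remain separate steps.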
Generally speaking, glycan analysis uses some form of liquid chromatography on the front end. Lebrilla’s laboratory has several such systems, including basic HPLC, UPLC and a nanoLC—an Agilent quadrupole-TOF with a microfluidic HPLC-Chip interface on the front end.
The HPLC-Chip is a microfluidic, nanoflow liquid chromatography interface to the MS, and according to Keith Waddell, director of applications marketing at Agilent, one of its primary advantages is that it is easy to change column chemistries. “It’s easy to change one column configuration for another just by swapping out the chips,” Waddell says—for instance, from C18 to porous graphitized carbon (PGC).
For those studying monoclonal antibody glycosylation in particular (the biologics therapeutics market, for instance), Agilent offers an mAb-Glyco Chip Kit that performs on-chip PNGase F cleavage in two minutes, followed by sample enrichment and fractionation on tandem PGC columns.
One popular MS strategy for glycoprotein analysis uses the Thermo Fisher Orbitrap Elite with optional ETD (electron transfer dissociation). Most mass spectrometers fragment proteins using collision-induced dissociation (CID). CID generally works fine for protein sequencing, but it tends to fracture the most labile bonds first; in the case of glycoproteins or glycopeptides, that usually is the bond connecting the glycan to the protein backbone. As a result, any information about glycan attachment is lost immediately.
ETD is an alternative, non-ergodic fragmentation method (as opposed to an ergodic one, like CID) that tends to leave post-translational modifications in place. Given a glycosylated peptide, an ETD-enabled mass spectrometer will continue to systematically cleave the peptide backbone, making it possible to sequence glycopeptides fully and to identify the site of glycosylation at the same time.
“ETD is the only method which will give unambiguous identification of the site of glycosylation,” says Rosa Viner, marketing manager at Thermo Fisher Scientific.
According to Viner, the preferred method of glycoprotein analysis on the Orbitrap Elite is called HCD-product dependent ETD. HCD is a collisional fragmentation method. In this approach, HCD is used to query individual peptides one at a time; upon encountering a peptide that releases a particular diagnostic fragment indicating the presence of an N-acetylhexosamine (HexNAc) sugar (m/z 204.086), the instrument switches to ETD mode to sequence the peptide.
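The decision at the heart of that method can be sketched in a few lines of Python. This is not Thermo Fisher's instrument logic, only an illustration of the idea: the function names, the 10 ppm mass tolerance and the 5 percent relative-intensity cutoff are all assumptions chosen for the example, and the diagnostic ion is taken to be the one commonly assigned to the HexNAc oxonium ion at m/z 204.0867.

```python
HEXNAC_OXONIUM_MZ = 204.0867  # diagnostic ion commonly assigned to HexNAc

def within_ppm(observed_mz, target_mz, tol_ppm):
    """True if observed_mz lies within tol_ppm parts per million of target."""
    return abs(observed_mz - target_mz) / target_mz * 1e6 <= tol_ppm

def should_trigger_etd(hcd_peaks, tol_ppm=10.0, min_rel_intensity=0.05):
    """Decide whether an HCD spectrum warrants a follow-up ETD scan.

    hcd_peaks is a list of (m/z, intensity) pairs. The thresholds are
    hypothetical defaults, not vendor settings.
    """
    if not hcd_peaks:
        return False
    base_peak = max(intensity for _, intensity in hcd_peaks)
    if base_peak <= 0:
        return False
    return any(
        within_ppm(mz, HEXNAC_OXONIUM_MZ, tol_ppm)
        and intensity / base_peak >= min_rel_intensity
        for mz, intensity in hcd_peaks
    )

# Made-up spectra: one with the diagnostic ion, one without.
glycopeptide_scan = [(204.0871, 800.0), (366.1400, 300.0), (500.2500, 1000.0)]
plain_peptide_scan = [(175.1190, 1000.0), (204.1340, 500.0)]

print(should_trigger_etd(glycopeptide_scan))   # True
print(should_trigger_etd(plain_peptide_scan))  # False
```

The second spectrum has a peak near 204 but more than 200 ppm away from the diagnostic mass, so it is rejected; this is where the high mass accuracy of the analyzer earns its keep.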
Arm yourself with knowledge
According to Magnelli, the first and perhaps most important item in any budding glycobiologist’s toolbox is information. A good starting point is the free e-book “Essentials of Glycobiology” (2nd ed.), from the National Center for Biotechnology Information (NCBI). Other useful resources include the websites of the Functional Glycomics Gateway and the CCRC.
To interpret glycan mass spectra, you’ll need a database or library to match chromatographic retention times and masses with structures. Several such resources are available, including at the ExPASy bioinformatics server, EuroCarbDB and the Functional Glycomics Gateway Glycan Search. Still others are in development. For instance, Lebrilla, in collaboration with Agilent’s Waddell and others, has built and characterized libraries of N-glycan fragments from the serum proteome, cow milk, human milk and other samples. “We have maybe 150 structures, and we’re close to 500 entries,” Lebrilla says.
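In principle, matching against such a library is a lookup on two coordinates, as the following Python sketch shows. It does not reflect the format or search rules of any of the resources named above: the entry names, retention times and tolerances are hypothetical, and only the masses are real (monoisotopic masses of the stated compositions).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LibraryEntry:
    name: str      # structure label (hypothetical)
    mass: float    # monoisotopic neutral mass, Da
    rt_min: float  # chromatographic retention time, minutes (made up)

# Toy library: two isomers share a mass and differ only in retention time.
LIBRARY = [
    LibraryEntry("Hex5HexNAc2", 1234.4334, 12.4),
    LibraryEntry("Hex5HexNAc4 isomer A", 1640.5921, 18.2),
    LibraryEntry("Hex5HexNAc4 isomer B", 1640.5921, 21.7),
]

def match_library(obs_mass, obs_rt, library, ppm_tol=10.0, rt_tol=0.5):
    """Return entries within both tolerances, best mass match first.

    ppm_tol and rt_tol are assumed values; suitable settings depend on
    the instrument and the chromatography.
    """
    hits = []
    for entry in library:
        ppm_error = abs(obs_mass - entry.mass) / entry.mass * 1e6
        if ppm_error <= ppm_tol and abs(obs_rt - entry.rt_min) <= rt_tol:
            hits.append((ppm_error, entry))
    hits.sort(key=lambda hit: hit[0])
    return [entry for _, entry in hits]

matches = match_library(1640.5950, 21.5, LIBRARY)
print([m.name for m in matches])  # ['Hex5HexNAc4 isomer B']
```

Mass alone cannot separate the two isomers in this toy library; the retention time does, which is why these libraries record both.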
Finally, for some real hands-on education, the CCRC offers a series of one-week training courses on such topics as polysaccharide characterization, glycoprotein and glycolipid characterization, glycosaminoglycans and mass spectrometry. Each course can accommodate about 22 students and costs $500 for academics, says Azadi.