What I'd like to do today is talk about two very distinct topics that both fall under the umbrella of how we can use metabolite profiling to answer important biomedical questions
So, in the first part of my talk, I'm going to talk about screening for inborn errors of metabolism and how untargeted metabolite profiling can change how we do things relative to the current approach, which is targeted identification of just a few of these very rare conditions
The possibility opens up for looking at a broad array of, or ideally all, inborn errors of metabolism so that we can start treatments that save babies' lives in the early days
And with that, I'm going to give you just one example of an inborn error of metabolism. We've looked at a few of these things. But, I think it's possible to do a more generic kind of screen to approach all disease
So, that is a genetic defect, a mutation, and the goal there is understanding the consequences of mutation
And then I'd like to talk about how we can understand the consequences of drug treatment. So, we give drugs, and we do so for given therapeutic reasons. We sometimes think we know what these drugs do. Often, we really don't know
And the application of untargeted metabolite profiling offers us the opportunity to really understand the systemic consequences of what drugs do
And one has to appreciate that it's a push-pull thing. So, even if you have a drug that is exquisitely specific for a given protein, perturbing that protein will impact broadly on all the enzymes, substrates, and products that connect with it
And often, the consequences of a drug action are not exactly what you might've predicted, and I'll give you an example of such
So, I apologize. I'm going to squeeze a lot of stuff in a small amount of space. So, bear with me
So, just to introduce this, I'd like to say a few words about the workflow. So, we start, of course, with LC-MS data acquisition. The workhorse in my lab is the Agilent 6224 TOF
We think it's very important to do TOF-only analysis in the first measure because, in the first instance, we really want to know what's differentially expressed
And in the second instance, we want to know what it is. And the what-it-is piece we'd like to identify quickly by database searches. When necessary, one does MS/MS. And this in our lab is done on a 6540 QTOF
So, we take raw data. The next step is molecular feature extraction. So, this takes all the different species that are observed and groups the isotopic clusters, the dimers, the adducts. So, basically, we're looking to assemble all the peaks that represent a single molecule
And then we take those features and align them with respect to retention time and mass so that we can next do some chemometric analysis, using analysis of variance to look for what's differentially expressed between one group of samples and another, perhaps from an animal with a gene mutation or from an animal that receives a drug, or from a patient for that matter
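As a rough illustration of the grouping idea, here is a minimal Python sketch. It is not the vendor's feature extraction algorithm. It assumes singly charged positive ions that co-elute, treats the lowest unexplained m/z as the protonated monoisotopic peak, and only recognizes the first carbon-13 isotope peak and the sodium adduct. The function name, the tolerance, and the example m/z values are all hypothetical.

```python
PROTON = 1.00728        # mass of a proton in Da
SODIUM_ION = 22.98922   # mass of Na+ in Da
C13_SPACING = 1.003355  # mass difference between 13C and 12C in Da

def group_peaks(mzs, tol=0.003):
    """Group co-eluting m/z values into features (illustrative only).

    Each feature is anchored on a peak assumed to be [M+H]+; later peaks
    are attached if they match the M+1 isotope or the [M+Na]+ adduct.
    """
    features = []
    for mz in sorted(mzs):
        for feat in features:
            delta = mz - feat["base"]
            if abs(delta - C13_SPACING) <= tol:
                feat["members"].append((mz, "M+1 isotope"))
                break
            if abs(delta - (SODIUM_ION - PROTON)) <= tol:
                feat["members"].append((mz, "[M+Na]+"))
                break
        else:
            # No existing feature explains this peak, so it starts a new one.
            features.append({"base": mz, "members": [(mz, "[M+H]+")]})
    return features

# Hypothetical peaks observed at one retention time.
features = group_peaks([176.1030, 177.1064, 198.0849, 291.1299])
for feat in features:
    print(feat["base"], [label for _, label in feat["members"]])
```

A real extractor would also handle dimers, multiple charge states, other adducts, and isotope intensity ratios, none of which this sketch attempts.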
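To make the analysis-of-variance step concrete, here is a minimal sketch of a one-way ANOVA F statistic computed for a single aligned feature. It assumes the feature's abundances have already been collected per group. The function name and the abundance values are hypothetical, and a real analysis would also convert F to a p-value and correct for testing thousands of features.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic for one feature.

    groups: list of lists, one list of abundances per sample group.
    Assumes at least two groups and some within-group variation.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of samples
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical abundances of one feature in control and treated samples.
control = [10.0, 11.0, 9.0, 10.0]
treated = [20.0, 21.0, 19.0, 20.0]
f_stat = anova_f([control, treated])
print(f_stat)
```

A large F means the difference between group means is large relative to the scatter within each group, which is what flags a feature as differentially expressed.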
And then once we've done this chemometric analysis, a very important next step is to do what we call recursive analysis, where we go back and look for things that weren't found in the sample set in the first pass
So, sometimes the computer makes a mistake and misassigns an m/z to the M+1 of another molecule when it really is the M+0 for a given molecule
So, we go back and look for it. And in the end, we're taking a very rigorous approach to data interpretation, whereby we typically only consider molecules that we find in every member of a given treatment group
So, that's, of course, a very restrictive case. We could lower the bar. But, that makes the best quality data in the first pass
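The "found in every member" rule amounts to a frequency filter. Below is a minimal sketch of one, assuming each sample is represented as a mapping from a feature identifier to its abundance. The function name, the `min_fraction` parameter, and the data are hypothetical. Setting `min_fraction` below 1.0 corresponds to lowering the bar.

```python
from collections import Counter

def filter_by_frequency(group, min_fraction=1.0):
    """Return the feature ids detected in at least min_fraction of samples.

    group: list of dicts, one per sample, mapping feature id -> abundance.
    """
    counts = Counter(feature for sample in group for feature in sample)
    needed = min_fraction * len(group)
    return {feature for feature, count in counts.items() if count >= needed}

# Hypothetical treatment group of three samples.
samples = [
    {"f1": 1200.0, "f2": 300.0, "f3": 50.0},
    {"f1": 1100.0, "f3": 65.0},
    {"f1": 1300.0, "f2": 280.0, "f3": 70.0},
]
strict = filter_by_frequency(samples)         # 100 percent frequency
relaxed = filter_by_frequency(samples, 0.6)   # a lowered bar
print(sorted(strict), sorted(relaxed))
```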
And then, once we've found things that are differentially expressed, we try to identify them. We use Gary Siuzdak's METLIN database, annotated in house with retention times, and take advantage of MS/MS fragmentation to make rigorous identifications
And in the end, we're typically wanting to integrate these data in terms of pathways to understand what patterns of change have occurred in response to a given gene mutation or drug, for the purposes of this presentation
In order to do this, it's important to have very robust LC-MS data. So, if the technical variation between measurements is high, it's impossible to see significant biological variation
So, this is taking a single sample of human plasma, injecting 0.2 microliters of it 56 times, and analyzing it by aqueous normal phase chromatography coupled to positive ion monitoring mass spectrometry
And what you're looking at here are the 56 overlays for the total ion counts. And you'll see that the overlay is quite tight. Oops. I'm not sure why that advanced
And in this panel B (I'm not doing that), for these 56 repeat measurements we've extracted 615 different metabolites and set the abundance of each of them to zero
So, appreciate that some metabolites might have 1,000 ion counts associated, some 1 million. So, we just normalize to set them to zero and then look at the variance over the 56 measurements. And you'll see that there's very tight variation
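One way to put metabolites with very different ion counts on a common zero-centered scale is sketched below. The assumption here, which the talk does not spell out, is that each metabolite is expressed as a log2 ratio to its own mean across the repeat injections, with the coefficient of variation as a summary of the spread. Function names and values are hypothetical.

```python
from math import log2
from statistics import mean, stdev

def center_log2(abundances):
    """Express each measurement as log2(abundance / mean), so the
    metabolite's typical level maps to zero regardless of its scale."""
    m = mean(abundances)
    return [log2(a / m) for a in abundances]

def cv_percent(abundances):
    """Coefficient of variation across repeat injections, in percent."""
    return 100.0 * stdev(abundances) / mean(abundances)

# Hypothetical repeat injections of a low- and a high-abundance metabolite.
low = [1000.0, 1020.0, 980.0]
high = [1_000_000.0, 1_020_000.0, 980_000.0]

print(center_log2(low), cv_percent(low))
print(center_log2(high), cv_percent(high))
```

After centering, the two metabolites show the same relative spread even though their raw counts differ a thousandfold, which is what allows them to share one plot.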
Again, it's getting away from me. And in the lower panel, just as an example, we've picked a few of those 615 metabolites and overlaid the peaks that correspond to their measurement
And each one is 56 overlays. And you can see that it's very reproducible
So, what we take home from this is that the technical variation is quite tight. So, we can really hone in and discover meaningful biological differences in terms of treatments and mutations
So, that was just aqueous normal phase chromatography with positive ion monitoring. To get the greatest coverage, we routinely perform both reverse phase and aqueous normal phase chromatography, because reverse phase separates things based on hydrophobicity and aqueous normal phase separates things based on hydrophilicity
And then we separately look at the positive ions and the negative ions. And here's the Venn diagram summarizing the data from, again, 0.2 microliters of plasma injected, analyzed by these four modes
And in aggregate, we can quantify with 100 percent frequency 3,715 features in the plasma
All right. So, now, turning this technology to address the problem of inborn errors of metabolism: what are inborn errors of metabolism? These are very rare monogenic diseases that result from a defect in a single enzyme, typically an important enzyme, that leads to a downhill cascade of events for the poor baby that suffers from this condition
Importantly, once recognized, many or most inborn errors of metabolism can effectively be treated by modifying the diet of the baby and/or providing some supplement
It's critical, though, that early diagnosis is made because, if the modification of diet and supplementation doesn't occur early on, this can often lead to irreversible brain damage. And so, the sooner the better in terms of making the diagnosis
In the United States, the greatest number of inborn errors of metabolism that are tested for in any of the 50 states is 30. The requirements are set state by state. West Virginia looks at one thing
But, in actuality, we know of 431 different genetic inborn errors of metabolism. There are probably 20,000. You know, there are over 20,000 genes. And there are probably over 20,000 things that can go wrong. So, there are many conditions that we just don't know about
I emphasize that these are very rare conditions. And it becomes a big problem to look for them. And that's why we don't
But, the importance of looking for them is depicted in this slide, where we're looking at three sisters who all suffer from the same genetic defect. It's glutaric acidemia type I
The two sisters that are flanking the one in the center benefited from early diagnosis and dietary manipulation. The middle sister was not checked. And the consequences are obvious
So, there are many conditions like this where early knowledge is critical for changing the future of the afflicted individual
So, what are the limitations of screening newborns for inborn errors of metabolism? First of all, the cost. Each condition is rare, anywhere from one in 5,000 to one in 250,000. So, it'd be far too expensive to test for all these different diseases
And that is even if the assays were available. And they're not all available
Second, it's important that you screen early. I mentioned that one has to start treatment early. And the mom and baby in the U.S. typically leave the hospital at day three
So, in the best of worlds, we'd like to know by day two that such-and-such a condition exists so that we can take appropriate action
And so, we'd need an assay that not only surveys all these different molecules but in addition can be performed very rapidly and deliver clinical findings very rapidly
And then a final requirement is that of sample availability. So, babies just don't have enough blood to enable testing of all these things one by one, even if we had the money and the time. So, you'd have to squeeze the baby dry in order to do this
But, what's currently done is a heel stick. And so, standard in many places is to take a drop of blood and dry the blood on a Guthrie card. And that's used for MS/MS on a triple quadrupole. And an example of such an analysis is depicted on this slide
And tandem MS, of course, is a very powerful technique. It's exquisitely specific and highly sensitive
And an experiment was done in 2003 in Germany to test how much benefit we get from doing this kind of testing by tandem MS
So, this study involved 250,000 neonates that were screened for 23 different inborn errors of metabolism. The overall sensitivity was 100 percent, as good as it gets
So, of those 250,000 neonates, 106 were confirmed to have an inborn error of metabolism. Seventy of them benefited from treatment
The problem is that, while the specificity is very high, 0.33 percent were false positives. Now, 0.33 percent on such a big number like 250,000 translates to 825 false positives. So, for 100 true positives, you have 800-plus false positives
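The arithmetic behind that comparison can be written out as a short Python sketch. It uses only the figures quoted in the talk (250,000 screened, 106 confirmed cases, 0.33 percent false positives) and adds the positive predictive value, which is the fraction of positive screens that turn out to be real. The variable names are illustrative.

```python
screened = 250_000           # neonates screened
true_positives = 106         # confirmed inborn errors of metabolism
false_positive_rate = 0.0033 # 0.33 percent of those screened

false_positives = round(screened * false_positive_rate)
positive_calls = true_positives + false_positives

# Positive predictive value: chance that a flagged baby is truly affected.
ppv = true_positives / positive_calls

print(false_positives)                    # number of false alarms
print(f"{100 * ppv:.1f} percent of positive screens are true positives")
```

Under these numbers only about one positive screen in nine is a true case, which is why even a sub-1-percent false positive rate is a practical problem when the conditions themselves are so rare.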
And the problem with this, as you'll see, is that you look for one thing. So, with current testing for inborn errors of metabolism, for those things we look at, we look for one molecule
The reality is that perturbation of any given enzyme causes vast and diverse changes, as you'll see in the example I show you. And it's not one thing that changes. It's many things. And if you honed in on these many things, there would be no false positives, I assert
All right. So, the potential strengths of untargeted metabolite profiling that address these limitations are, first, coverage. The screening is multiplexed, so we can try to look for everything that changes, comparing one set of plasmas with another
I showed you a moment ago that we can cover over 3,700 different molecules. The good news for us, and the bad news for the baby, is that when there's an inborn error of metabolism, typically, molecules increase in concentration, or at least the things that are upstream of the genetic defect do
So, it's not a problem typically of looking at low levels. It's really the issue of looking at high levels
Because you can cover, in theory, many, many things simultaneously, the problem of specificity goes away. This false positive rate should disappear
We only require a couple of microliters, less than what we currently drop onto a Guthrie card when we do these heel sticks for the babies. So, the sample is adequate to do this very broad profiling, which I think is the way we'll go in the future
The screening could be completed in perhaps 30 minutes to an hour. So, this would at least potentially allow for making the diagnosis before the baby and the mom leave the hospital
The cost for such analyses would be low after the initial investment in hardware and software. (This is on autopilot. I have to fight with this.)
And from a science viewpoint, it's possible to establish new knowledge. So, each baby really is an experiment of nature. And when you have a mutation in a given enzyme, we can learn the systemic consequences of that
So, beyond learning something about making a diagnosis for improved therapy, we can learn more about human biology from performing these kinds of profiling experiments
So, I had a few proof-of-principle experiments to show you. I'm going to show you one, given the time limitation
In a collaboration with Telia Wargoal [sp] in the Metabolism Lab at Columbia University, which takes in many high-risk mothers, we've been provided with some very rare plasmas that come from babies with inborn errors of metabolism
The one I'm going to talk about is argininosuccinic aciduria. So, argininosuccinate lyase is a urea cycle enzyme. The urea cycle serves to detoxify ammonia
The babies that are born with these urea cycle defects are unaffected at birth because mother takes care of the detoxification of ammonia while in utero
However, a few days after birth, babies develop hyperammonemia, ketoacidosis, vomiting, respiratory distress, and often go into a coma and possibly die. They actually die of mother's milk
So, I've worked with mice that have a urea cycle defect. And when they're born, if you separate them from the mother and don't let them suckle mother's milk, they'll live for a couple of days. If you let them suckle mother's milk, they die within four hours
So, it's the amino acids in the protein that make the ammonia that makes the toxicity
The prevalence of argininosuccinate lyase deficiency is one in 70,000 and if left untreated results in death
The treatment is a low-protein diet with arginine supplementation. Ultimately, there may be a liver transplant
So, this depicts the urea cycle. There's five enzymes in the urea cycle. Two of them are in the mitochondria. Three of them are outside of the mitochondria, as is argininosuccinate lyase
With an argininosuccinate lyase deficiency, argininosuccinate accumulates, as shown here. And that's the basis for the diagnosis using classical MRM approaches
Here, we're looking at untargeted metabolite profiling performed on plasma from one baby that has an argininosuccinate lyase deficiency. It's analyzed three times, shown here to the right of this hierarchical clustering analysis
This is a heat map, running from red hot to blue cold. And each little tick mark represents a feature
This surveys 1,185 features by aqueous normal phase chromatography with positive ion monitoring
This other group represents healthy babies. And you can see a black-and-white difference that the healthy babies are completely different from this one sick baby
Now, this is only one gene mutation. But, there's profound changes in the metabolites in the plasma
This is a principal component analysis comparing that sick baby, with the argininosuccinate lyase deficiency, with healthy babies. And you can see that there's a clear separation on the first principal component
The first principal component accounts for 90 percent of the variation in these data
So, what are these differences? So, the way you can look at what actually contributes to the separation in the principal component analysis is by doing a loadings plot. And that's shown here
One would be the maximum contribution, and this axis goes down to 0.4. So, these are things that contribute differentially to that loadings plot. But, the things at this end contribute most profoundly. And what we find is that this includes argininosuccinate and citrulline
In red are the results from the sick baby. In black, the flat line here, is the healthy baby. So, there's no argininosuccinate in healthy baby plasma. There's a little bit of citrulline in healthy baby plasma, but there's far more with argininosuccinate lyase deficiency
The rationale for citrulline accumulating is shown here. If you put a block in argininosuccinate lyase, well, you build up argininosuccinate, but potentially you can back up these other precursors to argininosuccinate. And in fact, what you do find is abundant amounts of citrulline
So, if you looked at citrulline plus argininosuccinate (thank you), you'd have a much better measure or means to make a diagnosis in this case
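For readers who want to see where such loadings come from, here is a minimal pure-Python sketch of the first principal component, found by power iteration on the covariance matrix. It is not the software used for the slide. The abundance values are invented for illustration, the function name is hypothetical, and "loadings" here means the raw eigenvector coefficients, which may be scaled differently from the axis on the plot.

```python
def first_component(rows, iterations=100):
    """Return (loadings, explained_fraction) for the first principal component.

    rows: list of samples, each a list of feature abundances.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]

    def times_cov(v):
        return [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]

    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * p
    for _ in range(iterations):
        w = times_cov(v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, times_cov(v)))
    total_variance = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_variance

# Invented abundances: three healthy samples, three replicates of a sick one.
features = ["argininosuccinate", "citrulline", "unchanged metabolite"]
healthy = [[0.0, 1.0, 5.0], [0.1, 1.1, 5.1], [0.0, 0.9, 4.9]]
sick = [[9.0, 6.0, 5.0], [9.1, 6.1, 5.1], [8.9, 5.9, 4.9]]

loadings, explained = first_component(healthy + sick)
ranked = sorted(zip(features, loadings), key=lambda fl: abs(fl[1]), reverse=True)
for name, loading in ranked:
    print(name, round(loading, 3))
```

Features with the largest absolute loading are the ones that drive the separation along the first component, which is the sense in which argininosuccinate and citrulline sit at the far end of the plot.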
But, what about all that other stuff? How does that fit in? So, I find this to be rather remarkable. And it speaks to the systems biology
So, we find, for example, abundant levels of these molecules, which are pyrimidines: ionocine [sp], hydrouracil [sp], deoxyonicine [sp]. Why are they there? Well, if you can't get rid of ammonia by the urea cycle, carbamoyl phosphate is synthesized from the ammonia in the cytosol. So, there's two carbamoyl phosphate synthetases. One of them's in the mitochondria. One is in the cytosol
So, if you don't deal with it in the mitochondria, you deal with it in the cytosol. And carbamoyl phosphate adds to aspartic acid to make orotic acid, which is the precursor to all pyrimidines
And we reconcile the finding of these pyrimidines as a consequence of not being able to run the urea cycle efficiently
Okay. So, just to summarize here, and I'll jump to the next piece, we think that untargeted metabolite profiling offers an extremely powerful approach that can be developed to broadly screen for rare inborn errors of metabolism without looking at individual conditions. We look at what's differentially expressed
The advantages over current state-of-the-art MS/MS for inborn error of metabolism screening are that this could be done at low cost, with much increased confidence, and with a very low sample requirement
It also offers the potential to expand our knowledge of protein functions. So, as I said, each baby is an experiment of nature. And we can learn about human systems biology by studying these individuals
Untargeted MS data can enable medical discovery. So, it can be used for diagnosis, but at the same time, it can inform us so that we understand better the functions of proteins
So, we become deluded by the annotation of proteins. You know, 70 percent of proteins in our database are annotated. A lot of those annotations are wrong. And most of them probably are incomplete. So, we have a lot more to learn about what these proteins do
And finally, inborn error of metabolism screening is a logical first step for the introduction of untargeted metabolite profiling into routine medical care in the future, undoubtedly. When you go to see your physician, you won't just get a 20-channel CBC, where, you know, you get some cardiac enzyme, a liver enzyme, a lipid profile. Why not learn as much as you can? Why not do a comprehensive screening? It makes sense to apply this to babies when they're just born. And once that is ingrained in our culture, I think it will extend to all medical care. Every time you see your physician, you'll have another measurement. And you'll gauge your lifetime in terms of metabolism throughout all of your travails as you grow up
So, to conclude, untargeted metabolite profiling offers a powerful systems biology approach for discovering the actions of genes and drugs
I hope I've made that case to you. And I really need to recognize the person who did all the work. She's sitting right there. This is Chew Ling Chen [sp]. She's the one who did all these experiments. I'm only here to gloat about it in her presence
And I thank you for your--