Dr. Marilyn Li: First, I would like to thank Agilent for inviting me to share our experience here
As we all know, cancer-associated genomic aberrations, they are important biomarkers. And they are important because they can be used for cancer diagnosis and prognosis
They can be used for disease classification, as you all know. Starting in the year 2000, the WHO has published a series of publications that use genetic aberrations to categorize the different cancer types. And that's just because treatment is based on the genetic changes
And they're used for risk stratification and also treatment selection
And this will require us, who are in the genetics community, especially in the cancer genetics community, to come up with good diagnostic tools
So, there are many genetic technologies that can be used to detect genomic aberrations. The earliest would be cytogenetics, where you need to look at the chromosomes
And there is a problem with resolution, as we all know. For cancer, the resolution is no better than 10 megabases
And then later on, starting in the '90s, FISH came onboard and was eventually used in clinical diagnosis by about the middle of the '90s. And as you know, we can detect double minutes and homogeneously staining regions with FISH
And we can also design probes to detect translocations, deletions, duplications, or use multicolor FISH to look at the whole genome
However, with this multicolor FISH technology, the resolution is no better than conventional cytogenetics
And the FISH technology is good. It can get to a resolution of about 100 Kb or even less; if you use indirect labeling, it can be as low as 1 Kb
But, it's only targeted. You don't look at the whole genome
So, a need for looking at the whole genome with very high sensitivity and high resolution is emerging
And I think the technology has also evolved to that point. Microarray technology started probably in the early '90s but only got into clinical diagnosis in this century
So, if we look at all the microarray technologies on the market, we can probably just put them into two categories
So, one is CGH-based arrays, comparative genomic hybridization. You co-hybridize your patient DNA with a normal reference DNA and look for the copy number changes
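As a rough sketch of the co-hybridization idea just described (an idealized illustration, not the consortium's analysis pipeline), the expected log2 ratio of patient signal to reference signal for one probe can be written out as below. The function name `expected_log2_ratio`, the pure tumor sample, and the two-copy reference are assumptions made for this example.

```python
import math


def expected_log2_ratio(patient_copies, reference_copies=2):
    """Idealized log2(patient / reference) signal for a single probe.

    Assumes a pure sample and a normal two-copy reference (both are
    simplifications introduced for this sketch).
    """
    if patient_copies <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive for a finite log ratio")
    return math.log2(patient_copies / reference_copies)


# One-copy loss, normal, one-copy gain, two-copy gain
for copies in (1, 2, 3, 4):
    print(copies, round(expected_log2_ratio(copies), 3))
```

In real samples the observed ratios are compressed toward zero by normal-cell admixture and noise, so these values are upper bounds on the shift rather than what a scanner reports.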
And the other technology would be the SNP-based arrays. So, because we all have one copy from our mother and one from our father, and because there are SNPs in our genome, and they are numerous, we can use that technology to eventually figure out whether there is a copy number change there
So, each technology has its pros and cons, and this table just summarizes that. For CGH-based arrays, you need to have a reference DNA, which is good. You have an intra-experimental control
You always have to have a normal DNA there as your control
And SNP-based arrays will not have that because they compare your experimental data with a preset group of references
And then the probes for a CGH array can be BACs, PACs, or oligos, depending on your needs. And you can make them bigger or smaller
And for SNP-based arrays, it's just oligos because you look at, you know, small regions
SNP arrays do give you a genotype. And if you want to figure out the maternal and paternal alleles, you can use that data to deduce that. Actually, now you can use that data for your next-generation sequencing as a patient ID if you're going to pool the patient blood or the DNA together
CGH array will not have this function
Also, for SNP-based arrays, in our genome there are some regions that just don't have enough SNPs, which I will show you. We call those SNP deserts
And in those regions, a SNP array will not cover as well, but with a CGH array, you pretty much can put your probe anywhere in the genome
But, because SNP arrays genotype the patient, you can detect uniparental disomy or copy-neutral loss of heterozygosity
And with the CGH array only, you will not be able to do that
So, in the genetics community, we have been pushing for a few years to have a combined platform
And we want to know copy number changes as well as the genotyping information
So, starting in 2009, a group of cancer geneticists were interested in applying this new technology in clinical diagnosis, and we formed a consortium called the Cancer Cytogenomics Microarray Consortium
We were officially established in 2009, and we have had two annual meetings already
So, we decided we were going to design a CCMC array targeted at cancer. So, you know, this is our first version of the array. We targeted about 427 Sanger genes, and those are proven, well-accepted cancer genes
And we also looked at genomic regions we know are commonly involved in cancer, where we don't really know which genes are involved. We know that segment is involved. So, we put in those regions as well as telomere regions that are commonly involved in cancer
We designed this array so it can be on a four by 44 K array, two by 105 K, eight by 60 K, or four by 180 K
And the coverage for those targeted genes is at least one probe per 0.521 Kb and at least one probe per exon. Actually, we originally wanted two, but in the end our result only had one, you know, for the minimum. And then there is a maximum of 200 probes per gene
And we also covered flanking region for each gene, about 15 Kb
And so, the remaining probes will be used to cover the interval of those targeted genes and regions
So, as I said, you know, this is just a screenshot from the genome browser to show you how those regions, the targeted regions and the backbone regions, are covered
So, as you can see, this is a part of chromosome 11. And you see the MLL gene. It's well covered. It's a big gene with a lot of exons. So, you see a lot of probes, and other genes around it
And in the backbone region, what you're going to see is dependent on what resolution of array you use. And four by 44 will have lowest coverage. And four by 180 will have highest coverage
So, starting last year, we worked with Agilent to put SNP probes on this array. So, the version 2.0 CCMC array actually comes with SNP probes. And actually, starting on the 11th of this month, Agilent has put this array in their catalog. That array has 60,000 SNP probes
So, at Baylor, we also wanted to get, you know, higher resolution. So, we designed another array that includes all the CCMC probes and also adds, you know, a lot more genes
And this Baylor array is based on the two by 400 K array because the four by 180 K just doesn't have enough room for that
So, that targets 2,300 cancer genes or cancer-associated genes and, also, those cancer genomic regions
And we also target 235 cancer-associated miRNAs
And of course, the average resolution is a lot higher. I'm not going into detail. But, we also put in 60 K SNP probes
The average exon coverage in this array, based on exon coverage for those cancer genes, is six probes per exon, so you can detect exon deletions if that's what you want to do
So, with these cancer-specific arrays, we can detect very low-level as well as very small deletions or duplications. So, we know this array is very powerful. I just want to give you a quick example with this case
And this is a patient of mine. It's an AML patient. And we found this unbalanced translocation between chromosomes 16 and 12, as you can see. The 16q was translocated to chromosome 12. And actually, this is a dicentric chromosome, an unbalanced translocation
So, when we did the FISH, what you can see here is, we know this 12p has a gene which is called the TEL gene. And we expected that gene to be deleted
But, our FISH showed the TEL gene not only is not deleted, but also, in many cells, it's amplified
So, we did a microarray, which you see here. And the majority of chromosome 12p is deleted. But, there is a small, very small region of about 0.79 megabases, 790 Kb, and that is duplicated
And if I enlarge, this is a part of the TEL gene. The TEL gene is also known as ETV6
So, there are many gene deletions. And they may not be whole gene deletion. And they could be just partial gene deletion
Look at this RB gene deletion. RB, of course, is a big gene. You see homozygous deletions and heterozygous deletions of different sizes, which is very common, and when you use this high-resolution cancer array, you will be able to detect that
If you use FISH, this is going to be normal
And this is another one with a PAX5 deletion. That's an ALL patient. And it's only 23 probes that are deleted in this case
As a matter of fact, 85 percent of the copy number changes identified by microarray are less than five megabases. And they are therefore cytogenetically invisible
So, we also identified a lot of so-called balanced translocations that are actually unbalanced at the molecular level. And this is just one case I'm going to show you, an MDS case. The patient was transforming into AML
And you see a three-way translocation between the two-chromosome series and the chromosome 12. And also, you see this microchromosomes. And FISH showed it as chromosome seven. And the patient is also triple X
And that triple-X aberration we later confirmed that was constitutional. But, the patient never knew before
So, you see array actually identified four microdeletions
And each microdeletion's associated with one of the translocation breakpoint
So, now, I'm going to switch gears to talk about this SNP array and why we want to put SNPs into CGH array and what's the significance, especially the clinical application
So, I just want to give a little bit of background about SNP arrays. So, first, what we use for SNP detection is a restriction enzyme to cut at some of the SNPs
So, use AluI as an example. AluI cuts AGCT. So, if you've got AGCT here on both alleles, both are going to be cut. And that will give you the lowest signal
And if you are heterozygous for that, you only have one copy cut and the other copy's not cut, so that will give you an intermediate signal
And if a patient has the SNP on both alleles, then the AluI enzyme will not be able to cut them, and that will give you the highest signal
So, that is how we use this very small difference in signal intensity to detect SNP information
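To make the cut and uncut logic concrete, here is a minimal sketch under simplifying assumptions: each allele is a short string, an allele is "cut" if it contains the AluI recognition sequence AGCT, and the signal is summarized only as lowest, intermediate, or highest. The function names and the toy sequences are hypothetical and are not taken from any vendor software.

```python
ALUI_SITE = "AGCT"  # AluI recognition sequence, as described in the talk


def uncut_copies(alleles):
    """Count alleles that lack the AluI site and therefore stay uncut."""
    return sum(1 for seq in alleles if ALUI_SITE not in seq)


def relative_signal(alleles):
    """Map a two-allele genotype to the relative signal level described above."""
    if len(alleles) != 2:
        raise ValueError("this sketch only handles the normal two-copy case")
    return {0: "lowest", 1: "intermediate", 2: "highest"}[uncut_copies(alleles)]


site_intact = "TTAGCTGG"  # hypothetical allele carrying the AluI site
site_lost = "TTAGTTGG"    # hypothetical allele where the SNP destroys the site

print(relative_signal([site_intact, site_intact]))  # both alleles cut
print(relative_signal([site_intact, site_lost]))    # one allele cut
print(relative_signal([site_lost, site_lost]))      # neither allele cut
```

Counting uncut copies, rather than calling genotypes directly, is what lets the same readout carry copy number information in the examples that follow.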
So, this is just an example. This patient has a large deletion of chromosome 3p and a part of 3q
And you can see very clearly, there is just zero and one. This is what we call zero. And this is one because there is only A or B, either cut or not cut
If it's not cut, you get lowest signal. If it's one copy cut and then you've got the intermediate signal
And then you also see, on the distal part of the Q arm, there is a complete loss of heterozygosity
So, if I have to look at this case, I would say this is probably a patient who lost a chromosome three and then, later on, that part of chromosome three duplicated. And then you got a complete loss of heterozygosity
Sorry, the slides are just not working very well
So, with this one I want to show you how, with the SNPs, you can deduce the allele pattern. And also, that will tell you whether your duplication of the genome or of a specific chromosome is because of biallelic duplication or amplification of one allele
Actually, that question came up yesterday on an array study without SNPs. And so, you won't be able to answer that question
But, with SNPs, you would be able to answer this question
So, this is a patient with CLL which started earlier this year. And you can see the patient has two small deletions, one small, one large deletion on the P arm of chromosome eight
And then you also see a large duplication on the Q arm of chromosome eight
And if you look at the SNP data, what it can tell us is that this duplication is caused by duplication of both alleles
And the reason I'm saying that is you will have this AAAA. Four of those alleles won't cut. You're not going to have one cut because they were duplicated
So, you're not going to have just one, like there's one allele which is not cut
And then you will have this AABB, which is the duplication of what was originally supposed to be one cut
And then you have BBBB. And that is four copies. So, I know this is a duplication of both alleles
On the other hand, this is another example. It's from a patient with AML. And you see a big duplication on the P arm of chromosome six. So, when you look at the SNP data, what you see here is you have alleles that don't cut at all
And you also have alleles that only give you one cut. And you don't have alleles that give you two cuts. And you have alleles that give you three copies uncut and four copies uncut
And so, this one tells me there is one allele that stayed the same, not changed at all, either cut or not cut
But, the other copy was amplified, triplicated. So, you got three copies. So, you see both of them are four copies, but the SNP pattern's different. And assume there is a mutation on this gene, on this allele. That mutation will be amplified at least three times
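The two four-copy patterns can be told apart by which uncut-copy levels are expected to appear. The sketch below is one way to reason about it, assuming each SNP is either homozygous (all copies cut or all uncut) or heterozygous (the uncut copies all sit on one parental homolog). The function name `expected_uncut_states` is hypothetical.

```python
def expected_uncut_states(copies_hap1, copies_hap2):
    """Uncut-copy levels expected across many SNPs in one segment.

    copies_hap1 and copies_hap2 are the copy numbers of the two parental
    homologs in that segment (an assumption of this sketch).
    """
    total = copies_hap1 + copies_hap2
    states = {0, total}                        # homozygous SNPs
    states.update((copies_hap1, copies_hap2))  # heterozygous SNPs
    return sorted(states)


print(expected_uncut_states(1, 1))  # normal two copies
print(expected_uncut_states(2, 2))  # both alleles duplicated: AAAA, AABB, BBBB
print(expected_uncut_states(1, 3))  # one allele triplicated: no two-uncut level
```

Both abnormal cases have four copies in total, so the CGH probes alone would look the same; only the presence or absence of the middle level separates them.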
So, this is another case I want to show you. This is our MDS case. And I want to show you a phenomenon which I call copy number gain loss of heterozygosity
So, we all know copy number loss. You have a deletion, you lose heterozygosity. And with copy number neutral, you can lose heterozygosity as well, as I showed earlier on that chromosome 6q
But, you may also have copy number gain loss of heterozygosity, as in the example I show here
So, this is an MDS patient. And you see the patient has a homozygous deletion, a heterozygous deletion. And then there is duplication of one allele, duplication of one allele
If you look at the SNPs here, you see very clearly it could be zero copy because nothing cut, as shown here
And then there is no two copies. And then you see this copy number gain, oh, sorry, copy gain loss of heterozygosity
And even though there are three copies, there is no heterozygosity there, which means, in this case, it probably completely deleted one chromosome 13, and part of the remaining chromosome 13 duplicated and then triplicated
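The three kinds of loss of heterozygosity mentioned here can be separated with a small rule, sketched below under the assumption that a segment has a total copy number from the CGH probes and a set of observed uncut-copy levels from the SNP probes. The function name `classify_loh` and the label strings are made up for illustration.

```python
def classify_loh(total_copies, observed_uncut_states):
    """Label a segment from its copy number and its SNP uncut-copy levels."""
    if total_copies == 0:
        return "homozygous deletion"
    # Any level strictly between 0 and the total implies heterozygous SNPs.
    heterozygous_levels = set(observed_uncut_states) - {0, total_copies}
    if heterozygous_levels:
        return "heterozygosity retained"
    if total_copies < 2:
        return "copy-loss LOH"
    if total_copies == 2:
        return "copy-neutral LOH"
    return "copy-gain LOH"


print(classify_loh(1, {0, 1}))        # simple deletion
print(classify_loh(2, {0, 2}))        # two copies of the same homolog
print(classify_loh(3, {0, 3}))        # three copies, no heterozygosity
print(classify_loh(3, {0, 1, 2, 3}))  # ordinary one-copy gain
```

Real SNP tracks are noisy, so in practice the levels would come from clustering the signal rather than from a clean set of integers as assumed here.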
So, this is another case of CLL. I want to show the allelic diversity. And if anyone can give me a clue and I just talk about a pattern, and if you can tell me this is because of a duplication of both allele or that's because of triplication of one allele. And they were in tears
So, basically, the pattern here shows--that's here. I'm going to pull out allele information here
And you see AAAA. And this is AAAB. So, you do have this one. And then you've got ABBB and BBBB. So, you've got more B's than A's. So, you know this is an amplification of one allele instead of duplication of the two homologous alleles
So, this slide shows you how important it is that we do microarrays, especially for CLL
And we see a lot of amplification of chromosome eight in CLL patients. And as we know, this probe is not in the CLL FISH panel. So, if you don't do that, you're going to miss it
Now, I want to show you some examples on solid tumors, both fresh frozen tissue as well as FFPE samples
So, this is fresh frozen tissue from a patient with hepatoblastoma. And you can see what we identified is a huge block of amplification on chromosome one, 1q
And the same thing happened in another case, in a similar region
And what happened was, when we did cytogenetics, what you see here in this case is about four, five, six copies of double minutes in each cell. And now, we know where the double minutes came from
And this other patient actually showed 50 to 60 double minutes in each cell
So, this represented about 30 to 60 double minutes here. And I want to show you, this region is identical to the region in the previous case
And so, we have our resident studying this region, trying to find out which genes are responsible for that
And I also want to show you a few cases of how this SNP data can help you to set the ploidy. As we know, cancer is not like a constitutional disorder, where you only have a minimal number of changes
But, in cancer, ploidy changes and multiple copy number changes are a common phenomenon we have to encounter, so correctly identifying the ploidy is very, very important
So, this is a case of renal cell carcinoma. And we actually did this on an FFPE sample
And so, we got this copy number plot here. So, this case can be called as either two copies and one copy, which we can call hypodiploidy, or you can say this could be four copies, and this could be two copies. That would be near triploidy
So, how can we decide on that? Sometimes, with just the CGH data, it could be very, very hard
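A small piece of arithmetic shows why the CGH data alone cannot separate these two readings: log ratios only capture the step between segments, and that step is the same for two copies versus one as for four copies versus two. The sketch below assumes idealized, noise-free signals, and the helper name `log2_step` is made up for illustration.

```python
import math


def log2_step(copies_high, copies_low):
    """Difference in idealized log2 signal between two segments."""
    return math.log2(copies_high) - math.log2(copies_low)


hypodiploid_reading = log2_step(2, 1)     # segments called as two copies and one copy
near_triploid_reading = log2_step(4, 2)   # same segments called as four and two copies

print(hypodiploid_reading, near_triploid_reading)
print(hypodiploid_reading == near_triploid_reading)
```

Because normalization places the baseline wherever most probes sit, the absolute level does not break the tie either, which is why an independent readout such as the SNP pattern is needed.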
And so, we look at this case along with the SNP data. So, normally, as I showed earlier, you have this light blue color. You have green at one cut. And then you have this dark blue, two cuts
But, in this case, you see those weird curves. And what it tells you is, you got A, and then you also got A minus. You got a BB. You also got a B minus. And the AB has pretty much disappeared
So, when we look at the SNP data, it shows very clearly that this is either monosomy or disomy
And so, this SNP pattern got us to think and readjust our software parameters to call the copy number change. And this actually helped us to make the clinical diagnosis
So, what is the case I was showing you? I didn't look at the pathology slides. Just based on this pattern, I can call this a chromophobe renal cell carcinoma because of the common deletion of those chromosomes
And if I plot all the cases with chromophobe renal cell carcinoma, you are going to see this pattern: deletion of chromosomes one, two, six, 10, 13, 17, and 21, very, very typical
If you see this pattern, this patient will have chromophobe RCC
And this is another case, with 48 percent tumor cells. And you can see the CGH called two copies, three copies, four copies, which is a very accurate call
And the SNP data also gives us corresponding data. So, we can call this case very easily as papillary RCC because you see this typical duplication or triplication of chromosome seven and trisomy of chromosome 17
And if we plot all of those patient data together, you are going to see this big blue duplication and another blue duplication of chromosome 17
This just shows, when you use FFPE samples, your SNP data is not going to be as pretty, because a lot of SNPs may disappear because of sample DNA degradation
So, you see in this one, this is tetrasomy seven. What I can say here is that you can see this is no cut. This is one cut. And then the three uncut is kind of diffuse. And that's four uncut
So, this is very likely a triplication of one allele. And there is one normal allele
And this is trisomy 17, which is kind of easy. This is zero, one, two, three copies
So, compared to hematological disorders, this data certainly is not as pretty, because in the hematological samples the DNA quality is a lot better
And so, I'm just going to flash through those because I think I'm probably over time
So, I want to show you how consistent our FFPE sample data is compared to our fresh sample data. The blue one is the fresh sample, and the pinkish one is the FFPE sample. And this is a case of complete concordance
But, sometimes, you also see an inconsistent result. But, I don't think this is because the FFPE sample is not accurate
I just think this is tumor heterogeneity and/or clonal diversity
So, you see this duplication may not be exactly the same. And here, there is no deletion on the fresh sample, but there is a deletion on the FFPE sample
So, this is a plot of a lot of renal tumor samples. And we see two big amplifications. So, we want to see exactly what happened there. What you see here is a chromosome six amplification
I want you to see--this is a fresh sample. This is an FFPE sample. So, it's a little bit, just a little bit more noisy compared to the fresh sample
And the other one is a big amplification of chromosome 18, as shown in this slide
The last example I want to show you also emphasizes the importance of SNP probes. Many times we need to know, for example, for a childhood ALL patient, they may have hyperdiploidy or near triploidy
And that will give the patient a good prognosis. But, if they are triploid because of duplication of hypodiploidy, that will give the patient a poor prognosis
So, I show this patient, a B-cell ALL, and you can see there are duplications, triplications here
And at the same time, when we ran the SNP probes and the Illumina array, as you can see, those diploid chromosomes have complete loss of heterozygosity, which is confirmed by the SNP array as well
And for those with four copies, they are duplications of two copies. See, that's AAAA, AABB, and BBBB. And that is also confirmed by the SNP array because, with the Illumina array, I think it's this software called
And if you have duplication of two copies, it's just going to show as two copies
So, this slide just shows you, if you only use SNP arrays, there are regions where there are just not enough probes to cover. So, I know my time is over
Okay. So, the last point I want to make is that SNP probes also help you to detect low-level mosaicism, as in this case. You can see the software actually did call a very small duplication on the P arm of this chromosome two
But, the SNP data clearly showed zero, one, two, three. So, that gives me the confidence to call this low-level mosaicism
So, in summary, with this cancer-specific array, CGH plus SNP, we can confirm, clarify, and further characterize cytogenetically and FISH-abnormal cases
But, we can also identify many submicroscopic aberrations. And the CGH plus SNP array allows us to detect loss of heterozygosity no matter whether it is copy loss, copy neutral, or copy gain
And SNP probes also can help us to confirm low-level mosaicism. And they provide important allelic information, which will help us to distinguish whether it's a duplication of hypodiploidy versus triploidy
And SNP probes also help us to establish ploidy status. And we think the array should be an adjunct test to the current standard of care, cytogenetics and FISH
But, in many cases, such as solid tumors or some of the hematological disorders, like CLL, and I would like to add probably MDS and multiple myeloma, it should be the first-tier test
And with this, I want to thank all the people who work with me, like Dr. Xiaofeng Hu, who's faculty at Baylor at the CGL Lab. She has been in charge of this program for years
And there are other people listed on this acknowledgment slide and also many people as collaborators, Dr. Anwar Iqbal, Charles Lee, Daynna Wolff, as well as many Agilent technical and bioinformatics support people
With this, I will conclude my talk, and I want to thank you for your attention