Copy Number Variation

Editorial Article

Monday May 04, 2009

by Jeffrey M. Perkel

The more we learn about the human genome, the more we recognize it for the dynamic structure that it is. Differences between "normal" individuals can come in many forms, from gross chromosomal variations to the nucleotide-level changes called single nucleotide polymorphisms, or SNPs. In two seminal 2004 papers in Science and Nature Genetics, researchers reported yet another, previously unknown layer of variation.

"For many years, the genetic community was looking at SNPs as the bulk of variation between individuals that could be associated with different phenotypes," says Dione Bailey, product manager at Agilent Technologies. "But, there were larger structural changes between individuals that could also be associated with population differences."

Called "copy number variations," or CNVs, they are exactly what their name implies: regions of DNA, generally longer than 1,000 bases, whose chromosomal copy number differs from individual to individual. Bigger than SNPs, but too small to be detected visually, these regions may be deleted, duplicated, triplicated, and so on. Like SNPs, CNVs can serve as genomic landmarks for genome-wide association studies. And, just like SNPs, they may be either benign (mere mile markers on the genomic highway) or causative of, or at least associated with, disease, disease susceptibility, and other phenotypes.

Since 2004, geneticists have rushed to incorporate CNVs in their analyses, and the ranks of these genetic variations continue to grow; the Database of Genomic Variants (DGV), which records CNVs, currently lists more than 38,000 entries. To aid this research, biotech firms have assembled a small but growing collection of tools for CNV discovery, validation, and subsequent analysis.

Four companies, Affymetrix, Agilent Technologies, Illumina, and Roche NimbleGen, offer microarrays for genome-scale CNV discovery and association studies. Agilent and Roche NimbleGen's arrays utilize a two-color assay, in which two DNA samples—a test and a reference—are labeled with different fluorescent dyes and then cohybridized to the array; copy number is assessed by the relative signal intensity of the two samples detected by each probe. Illumina and Affymetrix, on the other hand, hybridize just one sample per array, calling CNVs by comparison to a reference data file (Illumina) or on-chip controls (Affymetrix).
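
As a rough illustration of how a two-color readout becomes a copy number estimate, the sketch below computes the log2 ratio of test to reference intensity for a single probe and applies simple cutoffs. The function names, the thresholds of plus or minus 0.3, and the intensity values are illustrative assumptions, not any vendor's algorithm; real analysis software also corrects for dye bias and considers runs of neighboring probes before making a call.

```python
import math


def log2_ratio(test_signal, reference_signal):
    """Return log2(test / reference) for one probe's two dye intensities."""
    if test_signal <= 0 or reference_signal <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_signal / reference_signal)


def naive_call(ratio, gain_threshold=0.3, loss_threshold=-0.3):
    """Classify a log2 ratio with hypothetical, illustration-only cutoffs."""
    if ratio >= gain_threshold:
        return "gain"
    if ratio <= loss_threshold:
        return "loss"
    return "no change"


# Made-up intensities: with a two-copy reference, three copies in the test
# sample give a ratio near log2(3/2) and one copy gives log2(1/2).
for test, reference in [(1500.0, 1000.0), (500.0, 1000.0), (1020.0, 1000.0)]:
    ratio = log2_ratio(test, reference)
    print(f"test={test:.0f} ref={reference:.0f} "
          f"log2 ratio={ratio:+.2f} call={naive_call(ratio)}")
```

Working on a log scale makes gains and losses symmetric around zero, which is one reason relative intensities are commonly reported this way.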

Agilent Technologies

Agilent's product line includes arrays for both CNV and array comparative genomic hybridization (aCGH). The two experiments are fundamentally the same, says Bailey, except that aCGH assays tend to focus on genic regions, whereas CNV arrays take either a more unbiased tiling approach or target regions of known copy number variation.

In addition to aCGH arrays in 1x1M, 2x400k, 4x180k, and 8x60k formats (that is, one array per 1" x 3" glass slide with one million probes; two arrays per slide with 400,000 probes each, and so on), the company currently offers two catalog arrays specifically for CNV analyses. The first, a 2x400k array, draws heavily from the DGV, which mines the published literature, says Bailey; the other, a 2x105k product launched in early April, stems from the work of the Wellcome Trust Case Control Consortium (WTCCC), a 19,000-sample CNV association study.

"The difference between this [2x105k] array and the 2x400k," says Bailey, "is [the 2x105k array] is designed against the knowledge the WTCCC had from doing other high-resolution copy number variation experiments. Some of the CNVs on the array are in the Toronto database [DGV]; others were identified from their previous studies, and are often smaller than what is typically in the database."

Both arrays use Agilent's in situ synthesized 60-mer oligonucleotides. So will a soon-to-be-released 1x1M "high-resolution discovery" array, which, says Bailey, is completely unbiased in its probe distribution, offering an average of one probe every 3,000 bases. Users may also design their own arrays in any of eight formats (1x1M, 2x400k, 4x180k, 8x60k, 1x244k, 2x105k, 4x44k, and 8x15k), pulling content from Agilent's database of 24.3 million probes.
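
The probe spacing quoted for the 1x1M discovery array follows from simple arithmetic, sketched below. The haploid human genome length of roughly 3.1 billion bases is an assumption not stated in the article, and the calculation treats probes as evenly spread, which only approximates a real design.

```python
# Assumed approximate haploid human genome length, in bases (not from the article).
HUMAN_GENOME_BASES = 3_100_000_000


def average_probe_spacing(probe_count, genome_bases=HUMAN_GENOME_BASES):
    """Average number of bases per probe if probes were spread evenly."""
    if probe_count <= 0:
        raise ValueError("probe_count must be positive")
    return genome_bases / probe_count


# One million probes across the assumed genome length.
print(f"{average_probe_spacing(1_000_000):,.0f} bases per probe")
```

Average spacing is not the same as detection resolution, since calling a CNV generally relies on several consecutive probes agreeing.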

Roche NimbleGen

Roche NimbleGen also offers CGH arrays in a variety of formats. The highest-density, highest-resolution format comprises 2.1 million probes and enables detection of human CNVs genome-wide with nearly 5-kb resolution. Multiplex 3x720k and 12x135k formats have recently become available for increased throughput and cost-effective analysis of whole genomes or targeted loci.

According to Vanessa Ott, product manager for the CGH and CNV product line at Roche NimbleGen, "our unique combination of high probe density, expanded genome coverage, and flexible array format and design provides researchers with one of the most comprehensive array portfolios for high-resolution CNV discovery as well as higher throughput analysis of targeted regions."

The company offers numerous catalog arrays with unbiased whole-genome tiling designs for human and model organisms including mouse, rat, dog, cow, and non-human primates. According to Ott, NimbleGen is now focused on expanding its aCGH portfolio to include exon-focused and CNV-focused designs in a variety of array formats. In addition, NimbleGen arrays are "completely customizable," says Ott. "With our digital design and manufacturing capabilities, custom arrays can be quickly designed and created that consist of whole genomes, a single chromosome region, or a small number of targeted loci."

Illumina

Illumina's SNP-based product line features a range of Infinium HD arrays, says Dan Peiffer, product manager in Illumina's DNA Analysis Group. In this case, SNPs are used as surrogates for CNVs, the idea being that any deleted or amplified region that contains a SNP can be detected (though the assay may also use probes in regions where SNPs can't be found, of course). The assay is dual-color. For each position on the array, a primer bound one base upstream of the SNP is used to prime the addition of a single, fluorescently labeled nucleotide. The resulting color indicates the SNP allele.

"The benefit of our Infinium HD arrays is you can obtain both genotyping information as well as CNV intensity measurements," Peiffer says.

Illumina's HD product line includes the HumanCytoSNP 12 array, a 12-sample array with 300,000 markers per sample (12x300k), "positioned for the entry level genotyping and molecular cytogenetics market," says Peiffer. Next are the company's Human610-Quad and Human660W-Quad SNP arrays, the latter of which "contains most popular SNPs but boosted with additional content from thought leaders in the CNV field." Also available are custom-built arrays with up to 200,000 probes for each of 12 samples per slide. Finally, there's a 2x1.2M chip, "the cream-of-the-crop product," as Peiffer puts it, featuring one marker every 1,000 bases.

Affymetrix

The final array provider for the CNV market is Affymetrix, which recently released its so-called "Cytogenetics Solution," featuring arrays, reagents, and software for cytogenetics analysis. The solution includes two arrays, the Affymetrix Cytogenetics Whole-Genome 2.7M Array, targeting 2.7 million genomic features, and the 330,000-probe Cytogenetics Focused Array. Each targets both CNVs and SNPs, but custom arrays are not available, says Candia Brown, associate director of product marketing for DNA Applications.

These arrays, Brown says, are not intended to supplant the company's flagship Genome-Wide Human SNP Array 6.0. Both the Cytogenetics products and the 6.0 Array "provide coverage of the whole genome for both CNV and SNP detection," she says. "At the same time the 2.7M Array provides even higher coverage and confidence to detect CNVs and UPD [uniparental disomy] events than the SNP Array 6.0, especially in relevant regions like subtelomeres, haploinsufficiency genes, cytogenetic hotspots and cancer genes."

The Cytogenetics Solution also features a simplified assay, Brown says—"the new workflow does not include PCR amplification and is optimized to eliminate several washing steps"—and a dedicated software package, called the Chromosome Analysis Suite, "for graphical karyoview and chromosome view to enable instant detection of aberrations."

TaqMan® Copy Number Assays

Once users have identified, via genome-wide association studies, a set of candidate CNVs that may be associated with a particular phenotype, they must then validate that set on a larger set of samples. That's where TaqMan® Copy Number Assays from Applied Biosystems (a division of Life Technologies) come in.

"Arrays are very powerful for genome-wide copy number screening, but copy number calls tend to be more qualitative in nature," says Elizabeth Goley, product manager for Applied Biosystems Genotyping Assays. "TaqMan Copy Number Assays are more quantitative. Within a few hours you can validate your array findings with very specific and reproducible results on as many samples as you want."

The approach is also faster, higher-throughput, and much more flexible, as reactions are individually ordered. Users can select from any of Applied Biosystems' 1.6 million pre-designed assays (released in March 2009), Goley says, "or we can design a custom assay to any target sequence you submit."

TaqMan assays are quantitative PCR (qPCR) assays that are also used for SNP genotyping and gene-expression analyses. According to Goley, the CNV application is slightly different: a duplex PCR reaction, with one detector for the target of interest and another for a reference assay, which has two copies across samples. Copy number is determined via relative quantification of the two reactions.
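
The article does not spell out the relative quantification step. One common way to do it is the comparative Ct (delta-delta Ct) calculation, sketched here under several assumptions: each PCR cycle doubles the product exactly, a calibrator sample with a known copy number is run alongside the test samples, and the reference assay is present at the same copy number in every sample. All function names and Ct values are hypothetical.

```python
def estimate_copy_number(sample_target_ct, sample_reference_ct,
                         calibrator_target_ct, calibrator_reference_ct,
                         calibrator_copies=2):
    """Estimate target copy number by the comparative Ct method.

    Assumes perfect doubling per cycle, so a one-cycle difference in
    delta Ct corresponds to a two-fold difference in target abundance.
    """
    sample_delta = sample_target_ct - sample_reference_ct
    calibrator_delta = calibrator_target_ct - calibrator_reference_ct
    delta_delta_ct = sample_delta - calibrator_delta
    return calibrator_copies * 2 ** (-delta_delta_ct)


# Made-up Ct values: (target Ct, reference Ct) for each test sample,
# compared against a calibrator assumed to carry two copies of the target.
calibrator = (25.0, 25.0)
samples = {"sample_A": (26.0, 25.0), "sample_B": (24.4, 25.0)}

for name, (target_ct, reference_ct) in samples.items():
    copies = estimate_copy_number(target_ct, reference_ct, *calibrator)
    print(f"{name}: estimated copies = {copies:.2f}")
```

Normalizing the target to the reference within the same well corrects for differences in the amount of input DNA, which is what allows the duplex design to compare samples directly.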

Besides arrays and qPCR, researchers can also interrogate CNVs using fluorescence in situ hybridization (FISH). But that technique is relatively low-resolution and low-throughput, and it can probe only a small number of regions at once. Another approach is Multiplex Ligation-dependent Probe Amplification (MLPA), commercialized by MRC Holland, which combines multiplex PCR and capillary electrophoresis.

But perhaps one of the most promising, up-and-coming approaches is next-generation DNA sequencing. With that approach, says Bailey, researchers can study CNVs with what she calls "absolute resolution," as breakpoints can be identified at the nucleotide level. That's not to say more traditional approaches are going away, she cautions.

"Using [next-gen sequencing] for CNV analysis is a young and emerging field," she says. "It's unknown what the false positive/false negative rates are, and what are the limitations of that technology."
