by Jeffrey M. Perkel
To paraphrase The Naked City, there are three billion base pairs in the human genome; each one tells a story. Yet for the geneticists whose job it is to map phenotypic traits to genomic addresses, sifting through those billions of bases one at a time is impractical. Instead they use proxies, probing variations in some representative fraction of the genome to home in on regions of interest.
Say a researcher wants to identify susceptibility loci for a particular condition, such as lung cancer or Alzheimer's disease, but doesn't know where to look. To cast as wide a net as possible, he might scan hundreds of thousands or even millions of variations simultaneously, looking for those that consistently segregate with the disease in affected versus control individuals.
Alternatively, the researcher may want to know whether specific genes are associated with the condition, because his lab focuses on a particular biochemical pathway or process, for instance. In that case, he might run a more targeted study, looking for linkage with a relative handful of variations in selected candidate genes.
Or, maybe the researcher has already identified a particular genetic marker that associates with the condition he studies, and now wants to scan individuals for their genotype at that particular location to the exclusion of all others.
Generally speaking, these markers are single nucleotide polymorphisms, or SNPs. As their name implies, SNPs are variations at single base positions between individual members of a species. Suppose, for instance, that you could sequence the DNA of an entire population over a particular 1,000 bases. You might expect the resulting sequence to be identical over the population, but in general, it will not be; 70% of individuals, for instance, might have an A nucleotide at position 437, while the remainder have a G. That difference is a SNP, and literally millions of such variants have been identified.
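To make the 70/30 example concrete, here is a minimal Python sketch that tallies the bases found at one position across a set of aligned sequences. The toy sequences, the position index, and the function name `allele_frequencies` are illustrative assumptions, not data from any real population.

```python
from collections import Counter

def allele_frequencies(sequences, position):
    """Return the fraction of sequences carrying each base at a 0-based position."""
    counts = Counter(seq[position] for seq in sequences)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}

# Toy population (invented): ten short aligned sequences that differ only at index 3.
population = ["ACGATC"] * 7 + ["ACGGTC"] * 3
freqs = allele_frequencies(population, 3)
print(freqs)  # {'A': 0.7, 'G': 0.3}
```

A position where more than one base turns up at an appreciable frequency, as at index 3 here, is what the text calls a SNP.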
Reagent manufacturers have devised genotyping methods to accommodate just about every SNP density and sample throughput need. Though the methods vary in cost, principle, and required instrumentation, they nevertheless all answer essentially the same question: for each SNP tested, which alleles does an individual carry?
For the broadest SNP coverage, researchers generally turn to microarray-based approaches, such as those from Illumina or Affymetrix, in which each array position corresponds to a particular genomic variant. (Roche Nimblegen also offers an array-based genotyping product, but it specifically probes copy-number variants, or CNVs; the Nimblegen 2.1M Whole-Genome Tiling Array probes 2.1 million CNVs per slide.)
Illumina offers four off-the-shelf products in its BeadChip-based Infinium product line: the HumanCNV370-Quad, the Human610-Quad, the Human1M-Duo, and the HumanExon510S-Duo. These cover 373,000, 621,000, 1.2 million, and 511,000 genomic features, respectively, for each of two or four samples per slide.
In the Infinium assay, genomic DNA is amplified, fragmented, hybridized to the array, and then extended by a single fluorescently labeled base. The allele is called based on the resulting color on the array.
"You have two primers basically hybridizing to your DNA right at your SNP site, and depending on the allele of the SNP, you get a single base incorporated into your oligo, which you then detect [based on color]," says Carsten Rosenow, Senior Marketing Manager for Illumina's DNA analysis product line.
Affymetrix's alternative is the Genome-Wide Human SNP Array 6.0, which probes some 1.8 million genetic loci, including 906,600 SNPs and 946,000 CNVs, based on a "whole-genome sampling assay." "Integrating SNP genotyping, Copy Number Polymorphism (CNP) genotyping and rare CNV identification in one application, the SNP 6.0 Array enables the researchers to bridge CNP genotyping and classical SNP genotyping analysis in one genome-wide association study and/or high-resolution cytogenetic analysis," says Mindy Lee-Olsen, Vice President of Commercial Marketing.
Illumina also supports two finer levels of SNP density. Its iSelect chips run the same Infinium assay as the larger BeadChip arrays but probe between 6,000 and 60,000 variants each, drawn either from the off-the-shelf BeadChip products or from elsewhere, for up to 12 samples per slide. And the company's GoldenGate assay can probe between 96 and 1,536 SNPs per sample.
GoldenGate is a three-primer reaction, in which one primer is a kind of barcode for the assay, and the other two discriminate the SNP. First, the genomic DNA is incubated with all three primers, DNA polymerase, and ligase in a primer-extension reaction that creates a template for PCR. The extended product is then amplified using fluorescently labeled allele-specific primers, hybridized to tagged beads on a solid support, and the results read.
Traditionally, that solid support was a bead matrix in a 96-well microtiter-plate format, but recently, says Rosenow, the company released a 12-sample chip-based format, as well.
Sequenom's iPLEX Gold assay can probe up to about 36 SNPs per sample, says Chief Scientific Officer Charles Cantor. Like Infinium, iPLEX is a single-base extension reaction. But detection is based on mass, not color, Cantor says. Essentially, you anneal a primer just upstream of the SNP, and then provide four possible terminator nucleotides. The SNP allele is called based on the mass of the resulting product as measured in Sequenom's MassARRAY mass spectrometer.
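The idea of calling an allele from product mass can be sketched in a few lines of Python. This is only an illustration of the matching step, not Sequenom's software: the function name `call_allele_by_mass`, the tolerance, and every mass value below are invented for the example.

```python
def call_allele_by_mass(observed_mass, expected_masses, tolerance=5.0):
    """Return the allele whose expected product mass is closest to the observed
    mass, or None if no expected mass lies within the tolerance (in daltons)."""
    allele, expected = min(
        expected_masses.items(),
        key=lambda item: abs(item[1] - observed_mass),
    )
    if abs(expected - observed_mass) <= tolerance:
        return allele
    return None

# Made-up masses (Da) for one primer extended by each possible terminator.
expected = {"A": 5271.4, "C": 5247.4, "G": 5287.4, "T": 5262.4}
call_1 = call_allele_by_mass(5286.9, expected)
call_2 = call_allele_by_mass(5300.0, expected)
print(call_1)  # G
print(call_2)  # None
```

In this sketch a heterozygous sample would simply yield two peaks, each matched separately against the expected masses.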
"This is like shooting fish in a pond as far as mass spec is concerned," he says.
Marligen Biosciences' Signet custom and off-the-shelf genotyping assays can target up to about 15 SNPs per sample, says Chief Scientific Officer James Lazar.
Signet is based on Luminex's FlexMAP technology, a fluid-based microarray in which each color-coded bead is akin to a particular spot on a standard chip. According to Lazar, from 5 to 15 SNPs are PCR amplified out of a genomic DNA sample in a multiplexed PCR reaction and subjected to primer extension. The extension primers are key to the reaction, says Lazar: on their 5' ends, they contain a sequence that is complementary to one of the FlexMAP beads; on their 3' ends is a base complementary to one of the SNP alleles. If the 3' end of the extension primer is complementary to the SNP base, the polymerase will be able to extend, incorporating biotinylated nucleotides as it does so; if the primer is not complementary at the 3' end, nothing will happen.
"The specificity comes from the combination of the primer, which may or may not be mismatched at 3' end, and the polymerase that's used," says Lazar. "We use a polymerase that is very specific for perfect matches on the 3' end."
To read the reaction, the extended DNA is annealed to the FlexMAP beads (each bead corresponds to one SNP allele), incubated with streptavidin-phycoerythrin, and read in a Luminex reader.
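The allele-specific extension logic can be modeled in Python as a simple base-pairing check. This is a sketch under simplifying assumptions: it ignores the bead tags, the biotin label, and polymerase kinetics, and the names `primer_extends`, `genotype`, and the allele labels are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def primer_extends(primer_3prime_base, template_base):
    """A primer extends only if its 3' base pairs with the template SNP base."""
    return COMPLEMENT[primer_3prime_base] == template_base

def genotype(allele_primers, template_alleles):
    """Return the sorted allele labels whose primer extends on at least one copy.

    allele_primers maps an allele label to the 3' base of its extension primer;
    template_alleles lists the SNP base present on each chromosome copy.
    """
    return sorted(
        allele
        for allele, base in allele_primers.items()
        if any(primer_extends(base, t) for t in template_alleles)
    )

# Hypothetical primers whose 3' bases pair with template T and template C.
primers = {"allele_1": "A", "allele_2": "G"}
homozygote = genotype(primers, ["T", "T"])
heterozygote = genotype(primers, ["T", "C"])
print(homozygote)    # ['allele_1']
print(heterozygote)  # ['allele_1', 'allele_2']
```

Each label that comes back corresponds to a bead that would carry signal in the reader; a label that is missing corresponds to a primer that never extended.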
For the most focused studies, involving from one to maybe four SNPs, geneticists have several options. One is Idaho Technology's Hi-Res Melting.
According to lead scientist Cameron Gundry, Idaho Technology supports two different assays based on the method: small amplicon genotyping and Lunaprobe genotyping. Both are PCR-based, and both depend on a saturating high-resolution melting dye such as LCGreen Plus, which fluoresces more strongly when bound to double-stranded DNA than when free in solution.
In Lunaprobe genotyping, an unlabeled probe that covers the SNP region is annealed to the amplified product at the end of the PCR. The system then measures fluorescence intensity as it slowly heats the annealed complex, calling the SNP based on the fact that a slightly mismatched DNA duplex (i.e., one in which the SNP does not correspond to the probe) will have different melting properties than a perfectly complementary one.
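One common way to read such a melting curve is to find the temperature at which fluorescence falls fastest, that is, the peak of the negative derivative of fluorescence with respect to temperature. The Python sketch below does this on idealized, synthetic curves; the sigmoid shape, the two melting temperatures, and the function names are assumptions chosen for illustration, not measurements or the vendor's algorithm.

```python
import math

def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the interval where fluorescence drops
    fastest, i.e. the peak of the negative derivative -dF/dT."""
    best_tm, best_slope = None, float("-inf")
    for i in range(len(temps) - 1):
        slope = -(fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i + 1]) / 2
    return best_tm

def synthetic_curve(temps, tm, width=1.0):
    """Idealized sigmoidal melt: signal falls as the duplex dissociates near tm."""
    return [1 / (1 + math.exp((t - tm) / width)) for t in temps]

temps = [50 + 0.5 * i for i in range(61)]  # 50.0 to 80.0 degrees C in 0.5 steps
# Invented melting temperatures for a matched and a mismatched probe duplex.
match_tm = melting_temperature(temps, synthetic_curve(temps, 68.25))
mismatch_tm = melting_temperature(temps, synthetic_curve(temps, 63.25))
print(match_tm, mismatch_tm)  # 68.25 63.25
```

The genotype call then comes down to which melting peak, or peaks, a sample shows.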
"If it's a perfect match, you're going to have a higher melting temperature because it's bound much more tightly to its complement, whereas if it's sitting over where the mismatch is, there's some instability there because it's not a perfect match, and so therefore the melting temperature … is going to be lower," says Rachel Jones, Vice President of Sales and Marketing.
Another option is Applied Biosystems' TaqMan® chemistry. TaqMan involves four oligonucleotides: two PCR primers and two allele-specific dye-labeled probes, one for each SNP allele. Each allele-specific probe is tagged on one end with a fluorescent dye, and on the other with a fluorescent quencher that snuffs out the fluorescence. During amplification, whichever probe corresponds to the allele can hybridize to the template, between the two amplification primers; when the polymerase reaches the probe, it degrades it, releasing the fluorophore into solution.
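Because each probe carries its own dye, the genotype can be inferred from the relative strength of the two signals at the end of the run. The Python sketch below shows one simple way to make that call; the thresholds, the signal values, and the function name `call_genotype` are illustrative assumptions, and real analysis software typically clusters many samples rather than applying fixed cutoffs.

```python
def call_genotype(dye_1, dye_2, min_total=1.0, het_band=(0.3, 0.7)):
    """Call a genotype from two background-corrected dye signals.

    dye_1 reports the allele 1 probe and dye_2 the allele 2 probe.
    Samples with too little total signal are left uncalled.
    """
    total = dye_1 + dye_2
    if total < min_total:
        return "no call"
    fraction_1 = dye_1 / total
    if fraction_1 > het_band[1]:
        return "homozygous allele 1"
    if fraction_1 < het_band[0]:
        return "homozygous allele 2"
    return "heterozygous"

# Invented end-point signals in arbitrary fluorescence units.
print(call_genotype(9.5, 0.4))  # homozygous allele 1
print(call_genotype(4.8, 5.1))  # heterozygous
print(call_genotype(0.3, 8.7))  # homozygous allele 2
print(call_genotype(0.2, 0.1))  # no call
```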
According to Phoebe White, the company's Senior Director of Genotyping Applications, TaqMan is a one-tube, one-SNP assay. Yet it also is "very accurate, very clean," and "you don't need specialized equipment." And users could hardly ask for more variety; Applied Biosystems' catalog has swelled to more than 4.5 million TaqMan assays, she says.
Recently, Applied Biosystems announced the launch of the TaqMan OpenArray Genotyping platform, a nanofluidic array that enables customers to perform TaqMan genotyping at much higher throughput and lower cost per genotype than previously possible. "What this technology does is effectively multiplex the assay so you can interrogate up to 256 SNPs per sample," she says.
With so many tools and reagents to choose from, the stories of the human genome have never been easier to read.