From targeted looks at populations, pathways, and pathologies to pan-specific genome-wide association studies (GWAS), determining which individuals carry which alleles of which genes is an indispensable step in ascertaining everything from ancestral lineage to drug susceptibility to the likelihood of transplantation rejection to disease etiology. Single nucleotide polymorphisms (SNPs), insertions and deletions (indels), and even copy number variations (CNVs) are all looked to as indications of how individuals are different as well as how they are related. And in the case of diseases such as cancer, these can be used to track a tumor’s origin and descent.

Although next-generation sequencing (NGS), whether of the whole genome or a selection thereof, has become the go-to technique in much of contemporary genomics for discovering previously unknown alleles, this relatively data-intensive and costly technique has yet to surpass more tried-and-true technologies when it comes to genotyping large numbers of samples.

Genotyping

Leaving sequencing aside, the most common forms of genotyping include both randomly primed and targeted PCR-based techniques, capillary electrophoresis sequencing, restriction enzyme digestion, and allele-specific oligonucleotide probe detection, as well as microarrays, notes Penny Smorenburg, support and applications manager for Kapa Biosystems, a subsidiary of Roche Sequencing Solutions. Various hybrid methods—combining restriction enzymes or PCR with some form of NGS, for example—also are being used to query the genetic variation found among samples.

Among the principal considerations when choosing a genotyping technology are the number of samples and the number (and type) of variables to be tested. In addition, the per-sample and per-datapoint costs, throughput, turnaround time, hands-on time, and the availability of equipment, bioinformatics, and other required expertise all influence the methodology selected.

Sanger sequencing by capillary electrophoresis is “the gold standard … the platform of choice when it comes to validating clinical variants or pathogenic variants,” says Shantanu Kaushikkar, director, genotyping microarray platform, for Thermo Fisher Scientific. That being said, to look at just a few markers across many, many individuals, PCR may be the best choice.

A host of qPCR-based formats exist for genotyping small numbers of markers, with a variety of panels available from different vendors and contract-service providers, using TaqMan-type probes or SYBR-like chemistry, for example. These may be provided in 96- or 384-well plates and run in standard, real-time instruments.

Standard (endpoint) PCR is typically assessed following staining and electrophoresis and is generally not the assay of choice for genotyping. Digital endpoint PCR (dPCR), on the other hand, uses a fluorescent readout, enabling significantly higher throughput.

Fluidigm’s microfluidics-based Juno system can query up to 96 samples for 96 SNPs on a single integrated fluidics circuit (IFC). “It will take all your samples, all your assays. It will do all the mixing for you, and then it does the thermocycling,” says Paul Streng, product manager for genotyping at Fluidigm. The IFC chip is then imaged, and the data is processed by the company’s software. In addition to very low hands-on time, the platform boasts low sample and reagent usage—about 2.5 ng/SNP. Fluidigm’s platform also can be used to prepare up to about 5,000 targets for genotyping by sequencing (GBS).

There are other, more conventional ways to prepare samples for GBS and other targeted sequencing-based genotyping approaches. These range from restriction enzyme-based enrichment to multiplex amplification, whether developed as home-brew or bought off the shelf, to custom target-enrichment panels and kits.

For example, Thermo Fisher’s AgriSeq™ and Eureka panels are designed to prepare samples for NGS of a few tens to thousands of markers. They use a proprietary on-target, SNP-specific library preparation method that avoids some of the biases associated with multiplexed amplification and, depending on the technology, introduces allelic and sample barcodes, enabling unprecedented multiplexing levels, says Kaushikkar. As a result, the sequencer doesn’t have to read the entire genomic region through to the variant of interest to ascertain the genotype, which decreases sequencing time and increases the accuracy of SNP genotyping.

Endpoint PCR also can be coupled with mass spectrometry instead of NGS, as is the case with the Agena Bioscience (formerly Sequenom) MassARRAY® System. The Mayo Clinic genotyping core has been using these panels “to genotype large numbers of SNPs in a cost-effective manner” following the discontinuation of Illumina’s GoldenGate chemistry, says core director Julie Cunningham. Up to 40 different SNPs can be interrogated in a single multiplex reaction in 96- or 384-well formats. “So it’s a moderate throughput, and they prove very useful and fairly robust,” she says.
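To put the plate and chip formats mentioned above side by side, the short Python sketch below works out the maximum number of genotype calls per run. It is only back-of-the-envelope arithmetic: it assumes one sample per well, that every reaction runs at the maximum stated plex, and that no wells are reserved for controls. The function names and the 1,000-sample, 96-SNP study are hypothetical and chosen purely for illustration.

```python
import math


def genotypes_per_run(samples_per_run: int, snps_per_sample: int) -> int:
    """Upper bound on genotype calls from one run (no controls, no failures)."""
    return samples_per_run * snps_per_sample


def runs_needed(n_samples: int, n_snps: int,
                samples_per_run: int, snps_per_run: int) -> int:
    """Runs required to type n_snps in n_samples on a fixed-format platform."""
    sample_batches = math.ceil(n_samples / samples_per_run)
    snp_batches = math.ceil(n_snps / snps_per_run)
    return sample_batches * snp_batches


# Formats quoted in the text.
ifc_96x96 = genotypes_per_run(96, 96)         # 96 samples x 96 SNPs on one IFC
plex40_96well = genotypes_per_run(96, 40)     # 40-plex in a 96-well plate
plex40_384well = genotypes_per_run(384, 40)   # 40-plex in a 384-well plate

# Hypothetical follow-up study: 1,000 samples, 96 SNPs.
ifc_runs = runs_needed(1000, 96, samples_per_run=96, snps_per_run=96)
plate_runs = runs_needed(1000, 96, samples_per_run=384, snps_per_run=40)

print(ifc_96x96, plex40_96well, plex40_384well)
print(ifc_runs, plate_runs)
```

Run counts alone do not settle the choice of platform. Cost per data point, hands-on time, and turnaround time still have to be weighed alongside them.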

Microarrays

Such targeted approaches are often used in follow-up studies, once researchers have narrowed down the markers of interest from the much larger numbers of SNPs and indels that microarrays interrogate in smaller numbers of samples. Some arrays cover millions of variants. And because some markers are in strong linkage disequilibrium, placing one of them on the array lets the genotypes of many others in the same haplotype block be inferred by imputation.
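The idea that one marker can stand in for others rests on linkage disequilibrium, which is commonly summarized by the r² statistic between two biallelic markers. The Python sketch below computes r² from phased haplotypes coded 0/1. It illustrates the measure only, not how imputation software works, since real imputation relies on reference panels and statistical models. The function name and the toy haplotypes are invented for this example.

```python
from typing import Sequence


def ld_r2(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """r^2 between two biallelic markers from phased haplotypes coded 0/1."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                                  # freq. of allele 1 at marker A
    p_b = sum(hap_b) / n                                  # freq. of allele 1 at marker B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n   # freq. of the 1-1 haplotype
    d = p_ab - p_a * p_b                                  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a marker is monomorphic")
    return d * d / denom


# Toy data (invented): eight phased haplotypes at three markers.
tag_snp = [0, 0, 1, 1, 0, 1, 0, 1]
linked_snp = [0, 0, 1, 1, 0, 1, 0, 1]   # always travels with the tag SNP
weak_snp = [0, 1, 0, 1, 0, 1, 0, 1]     # only loosely associated

r2_linked = ld_r2(tag_snp, linked_snp)
r2_weak = ld_r2(tag_snp, weak_snp)
print(r2_linked, r2_weak)
```

An r² near 1 means the genotype at the tag SNP almost fully predicts the other marker, which is the situation in which leaving that marker off the array costs little information.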

“We’re actually seeing a trend toward more population-specific arrays,” says Kaushikkar. Here both catalog (like the Axiom™ Asia Precision Medicine Research Array) and custom arrays—often in collaboration with national biobanks—allow a greater proportion of regional variation to be queried.

In addition to high-density, genome-wide arrays, “both Illumina and Affymetrix have come out with newer generations of cost-effective arrays that have a very defined content that may be specific for specific populations,” says Cunningham. “We are running a number of those for biobanks, for example.”

Arrays can be ordered either as off-the-shelf products or with custom-designed content based on a customer’s list of SNPs and other variations.

There is little question that falling NGS costs and improved multiplexing have eroded microarrays’ genotyping market share. Arrays must be factory-programmed, and “what a lot of customers want these days is that flexibility to swap out assays and samples,” notes Streng. On the other hand, he says, it can be very complex and expensive to prepare samples for sequencing by traditional means.

Sequence first, ask arrays later

Once the discovery work has been done, there’s no need to sequence every individual in a large population, and so an array makes the most sense, says Kaushikkar. “Customers prefer to use arrays over NGS,” he says, because it’s quicker, easier, cheaper and more reproducible.