Biobanking: Global Repositories for a Multitude of Samples

Jeffrey Perkel has been a scientific writer and editor since 2000. He holds a PhD in Cell and Molecular Biology from the University of Pennsylvania, and did postdoctoral work at the University of Pennsylvania and at Harvard Medical School.

If you want to study genetic variation in relation to phenotypic information, especially in the context of rare genetic variants, you’re going to need a large sample set. If you’re lucky, the needed samples are already sitting in the freezer. More likely, though, you’ll need to obtain them elsewhere.

One potential source for large sample numbers is a biobank. As the name suggests, biobanks are archives or repositories for biological materials of all sorts. “The wide array of biospecimens (including blood, saliva, plasma and purified DNA) maintained in biobanks can be described as libraries of the human organism,” states the Coriell Institute for Medical Research, which runs one such resource. 

Biobanks come in all shapes and sizes, and with different levels of access, says Andrew Brooks, chief operating officer for RUCDR Infinite Biologics, a Rutgers University-run biorepository comprising some 12 million DNA samples and 8.5 million cell lines. Some are general-purpose archives, and others focus on specific medical conditions. Some make samples publicly available, others are private, and still others fall in between: open only to researchers participating in particular clinical studies, for instance.

More than a freezer

So just what, physically, is a biobank? Fundamentally, it could be as simple as a dedicated laboratory freezer accompanied by rigorous documentation. But according to Matthew Hamilton, president of Hamilton Storage, many biobanks opt for more elaborate setups to ensure consistency, accuracy, documentation and quality of banked samples. “It’s not that manual isn’t still done,” he says, “but it just became accepted over time that there are more efficient ways to store precious samples, using robotics.”

One significant issue with traditional freezers, for instance, is that every time they are opened and closed, the samples can warm up. Also, when freezers are freely accessible, samples can become lost, misplaced or contaminated. To circumvent those problems, Hamilton Storage sells an automated freezer system called Hamilton BiOS, for storage of 10,000 to more than 10 million samples. Rather than adding and removing samples manually, interaction is via a computer interface. The system inserts and retrieves samples on demand, logging not only the locations of samples, but also when they were retrieved, by whom and for how long. “The robotics are used to implement best practices of the industry, as sample integrity and audit trails are needed for reliable, accurate data for published studies,” Hamilton says.

Hamilton adds that his company is working with the LifeLines Biobank in Groningen, the Netherlands, on a repository that ultimately will store some 13 million samples.
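The audit trail Hamilton describes (where a sample sits, when it was retrieved, by whom and for how long) can be pictured as a small data structure. What follows is a minimal sketch in Python, not a description of how Hamilton BiOS or any other commercial system works: the class names, the storage-address format and the sample ID are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Retrieval:
    """One checkout of a sample: who took it, when, and when it came back."""
    user: str
    taken_at: datetime
    returned_at: Optional[datetime] = None

    @property
    def duration(self) -> Optional[timedelta]:
        # Time spent out of storage; None while the sample is still checked out.
        if self.returned_at is None:
            return None
        return self.returned_at - self.taken_at


@dataclass
class SampleRecord:
    """Hypothetical record for one banked tube and its retrieval history."""
    sample_id: str
    location: str  # made-up addressing scheme, e.g. "rack 12 / row C / slot 7"
    history: List[Retrieval] = field(default_factory=list)

    def retrieve(self, user: str, when: datetime) -> None:
        # Refuse a second checkout while the sample is already out.
        if self.history and self.history[-1].returned_at is None:
            raise ValueError(f"{self.sample_id} is already checked out")
        self.history.append(Retrieval(user=user, taken_at=when))

    def restore(self, when: datetime) -> None:
        # Close the open checkout by recording the return time.
        if not self.history or self.history[-1].returned_at is not None:
            raise ValueError(f"{self.sample_id} is not checked out")
        self.history[-1].returned_at = when


# Usage: one sample leaves storage for twelve minutes.
record = SampleRecord("S-0001", "rack 12 / row C / slot 7")
record.retrieve("jdoe", datetime(2015, 11, 2, 9, 0))
record.restore(datetime(2015, 11, 2, 9, 12))
minutes_out = record.history[0].duration.total_seconds() / 60
```

Because every retrieval is appended rather than overwritten, the record can later answer the questions an auditor would ask: who handled the sample, and how long it spent outside the freezer.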

Extensive ‘typing’ of samples

UK Biobank, comprising the biological material of some half-million British citizens, is one example of a public biobank. According to Tim Peakman, deputy chief executive of UK Biobank, the individuals who were recruited into the bank were not chosen to match any particular condition. The only recruitment criterion was age: All participants were between 40 and 69 years old. These individuals were extensively phenotyped with lifestyle and psychological assessments and physical measures, and a range of biological specimens were collected. Participants also consented to having their medical progress tracked over time, making it possible to build large patient cohorts and correlate biomarkers with biology.

“This is a fully open-access resource,” Peakman says. “The Biobank built the resource, it is researchers’ job to make use of it.”

At a minimum, biobanks typically record basic information about each sample, such as the donor's gender and age and the sample's source. But they often supplement those data with more specialized information, which can increase the bank’s value to researchers. UK Biobank, for instance, obtained funding to perform MRI and ultrasound imaging on 100,000 of its participants, Peakman says, and to genotype all participants.

Using a custom microtiter-plate-based microarray called the UK Biobank Axiom® Array, UK Biobank has collected data on some 820,000 single-nucleotide variants for each of its 500,000 individuals, producing a collection of more than 400 billion data points. Those data were then passed to the Wellcome Trust Centre for Human Genetics at Oxford, which applied bioinformatics algorithms to “impute” (or infer) another 70 million bases per genome, Peakman says. The imputed bases represent about 2% of the genome overall.
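The scale of those figures is easy to check with a few lines of arithmetic. The sketch below uses the numbers quoted above; the genome size of roughly 3.2 billion bases is an assumption added for illustration and does not come from Peakman.

```python
# Figures quoted in the article.
variants_per_person = 820_000
participants = 500_000
imputed_bases = 70_000_000

# Assumption (not from the article): a haploid human genome of ~3.2 billion bases.
genome_size = 3_200_000_000

# Directly genotyped data points across the whole cohort.
data_points = variants_per_person * participants

# Share of each genome covered by the imputed bases.
imputed_fraction = imputed_bases / genome_size

print(f"{data_points:,} genotyped data points")   # 410,000,000,000
print(f"{imputed_fraction:.1%} of the genome imputed")  # 2.2%
```

Under that assumption the numbers are consistent with the text: about 410 billion directly genotyped data points, and imputed bases covering a little over 2% of each genome.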

According to Laurent Bellon, senior vice president and general manager for the Genetic Analysis Genotyping Business at Affymetrix, the company has developed custom genotyping arrays for a number of biobanking projects around the world, including UK Biobank, the Million Veteran Program at the U.S. Department of Veterans Affairs and biobanks in both China (500,000 individuals) and Korea (600,000).

“We have extensive customizability capabilities,” Bellon says. “Different biobanks have different scientific objectives that need tailored content, and no two designs are the same.”

In one recent example, researchers used the data from 50,000 participants in UK Biobank to search for genetic variants associated with smoking and lung function, identifying six new candidate loci [1].

Researchers who are not part of a biobank study can typically obtain these arrays, too, Bellon says. By using them, they can easily compare their data set to that of the bank and enrich their data by building a larger-scale meta-analysis.

Expanding applications

Illumina also offers microarrays specifically intended for biobanking applications, says product manager John Picuri. These include the CoreExome-24 and OmniExpress-24, as well as the company’s newest offering, the Multi-Ethnic Array family. Comprising “over 400,000 exonic markers and over 20,000 hand-curated markers from OMIM, ClinVar and PharmGKB,” these arrays include variants targeting “all of the world’s populations,” as well as “more targeted arrays focused on European, East Asian and South Asian populations (EUR/EAS/SAS) and Hispanic and African-American populations (AMR/AFR).”

Such arrays have a range of applications, Picuri says, including stratifying clinical-trial populations and disease-specific research. For biobanks with a specific research aim (e.g., a specific disease or cluster of diseases) “performing genetic analyses allows for a better understanding of the underlying genetic basis of the disease or phenotype.”

Unfortunately, access to biobanked samples and data is not universal, says Brooks. In some cases, he says, samples are more or less generally available for a nominal fee. In other cases, researchers must apply to the funding agency or a project’s research coordinator. But in any event, the effort is likely worth it: For the researchers who can get access, the result is a wealth of data they might never otherwise be able to tap into.


Reference

[1] Wain, LV, et al., “Novel insights into the genetics of smoking behaviour, lung function, and chronic obstructive pulmonary disease (UK BiLEVE): a genetic association study in UK Biobank,” Lancet Respir Med, 3:769-81, 2015. [PMID: 26423011]
