Characterizing the Microbiome

 Microbiomics
Josh P. Roberts has an M.A. in the history and philosophy of science, and he also went through the Ph.D. program in molecular, cellular, developmental biology, and genetics at the University of Minnesota, with dissertation research in ocular immunology.

The microbiome, broadly defined, is the community of all the microorganisms in a location. Microbiomics, in turn, is the study of that community and its ecosystem. The same kinds of questions can be asked about the microbiome that traditional (macro)ecologists ask about traditional (macro)ecosystems, says Eric Wommack, co-editor-in-chief of the journal Microbiome. But instead of how the Yellowstone ecosystem exhibits a balance of willows, wolves and elk, in the microbiome “that ecosystem function is how well your gut works, or how well the sewage treatment plant works, or the fertility of an acre of soil, or the ability of an estuary to tolerate oil pollution.”

In place of clickers and clipboards, microbiomists use high-throughput DNA sequencing, mass spectrometry (MS), biochemical analysis and other tools to figure out which organisms are present, how many of each there are, what genes they carry, what RNAs they express, what proteins they make and what metabolites they generate. Often, the goal is to understand the relationships among community members well enough to ultimately manipulate them.

Resolving cliques with 16S

There are many ways to study the collection of microbes in a system, says Jonathan Eisen, professor of medical microbiology and immunology at the University of California, Davis. You can look in a microscope (and many people still do this), but microscopy is not very high-throughput, and many microbes that are distantly related may look the same. You can culture the organisms, but again, culturing is not very high-throughput, and many organisms from particular environments don’t grow well in a lab. In fact, it’s estimated that upwards of 90% of commensal bacterial species in the human intestine cannot be cultured using current techniques.

More than 20 years ago, researchers began characterizing microbes on the basis of their DNA, specifically the 16S ribosomal RNA gene, which can be used to differentiate nearly any genus of bacteria or archaea with a single set of well-chosen primers. “That chugged along slowly and expensively, but people did it, because they thought it was worthwhile,” Eisen recalls. “What’s gone crazy in the last couple of years, and the reason there seems to be a microbiome article every other day, is that Illumina sequencing has gotten cheap and easy.”

Most of the data out there right now comes from amplicon sequencing of the 16S gene, says Bryan White, director of microbiome projects in the division of biological sciences at the University of Illinois at Urbana-Champaign. White likens the microbiome to the different inhabitants of a city and microbiomics to census taking. 16S amplicon sequencing enables the researcher to take a census of who’s there, and how many there are. After a run that typically yields millions of sequences, those sequences with at least 97% similarity to each other are binned together, much as a census might group residents into categories such as white males or Hispanic females. Each bin is considered a distinct species or operational taxonomic unit (OTU). A human reproductive-tract sample yields approximately 2,000 OTUs and a human gut sample about 40,000, and with “soil, we can’t do an estimate, it’s so high,” White says. The number of sequences found in each bin represents the relative abundance of that OTU.
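The binning and counting step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the reads are taken to be pre-aligned and of equal length, similarity is a plain fraction of matching positions, and the greedy "first matching seed" rule is one common heuristic rather than the method any particular researcher quoted here uses. The function names (`identity`, `bin_into_otus`, `relative_abundance`) are hypothetical.

```python
from collections import Counter


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes pre-aligned, equal-length reads")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def bin_into_otus(reads, threshold=0.97):
    """Greedy binning: a read joins the first OTU whose seed sequence it
    matches at >= threshold identity; otherwise it seeds a new OTU."""
    seeds = []          # one representative sequence per OTU
    counts = Counter()  # OTU index -> number of reads in that bin
    for read in reads:
        for i, seed in enumerate(seeds):
            if identity(read, seed) >= threshold:
                counts[i] += 1
                break
        else:
            # No existing seed was similar enough, so start a new bin.
            seeds.append(read)
            counts[len(seeds) - 1] += 1
    return seeds, counts


def relative_abundance(counts):
    """Share of all reads that fall into each OTU."""
    total = sum(counts.values())
    return {otu: n / total for otu, n in counts.items()}


# Toy 100-base "reads": two identical, one 98% similar, one 95% similar.
reads = ["A" * 100, "A" * 98 + "CC", "A" * 95 + "CCCCC", "A" * 100]
seeds, counts = bin_into_otus(reads)
print(len(seeds), relative_abundance(counts))
```

In this toy input the 98%-similar read lands in the same bin as the two identical reads, while the 95%-similar read starts its own OTU, so two reads that differ at a couple of positions are counted as the same taxon.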

Yet sequencing the 16S gene alone puts benign E. coli K12, pathogenic E. coli O157:H7 and Proteus (which shares 98% of its sequence with E. coli) in the same bin. To infer from that kind of data what the community is doing “is pretty bold,” says White, referring to algorithms that try to do just that.

Getting accuracy on shotgun approaches

Because of the sequencing depth at which it’s carried out, 16S profiling is able to query essentially the whole community, “but I don’t get any information about any other genes,” says Wommack.

With metagenomics (“where you go in and shotgun sample genomic DNA out of an environmental sample and then sequence that, without a particular gene target in mind”), it’s hard to sequence deeply enough to be sure that the rarest organisms have been sampled, Wommack says. But the tradeoff is that “I can start to look at actual functional genes—genes that are actually doing things and responsible for particular functions that are important in an ecosystem, or produce an important metabolite.”
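A back-of-the-envelope calculation shows why depth is the limiting factor for rare organisms. The sketch below assumes, purely for illustration, that each read is drawn independently and in proportion to an organism's relative abundance, so the chance of seeing it at least once in N reads is 1 - (1 - p)^N. Real shotgun data departs from this (genome size, extraction and library bias all matter), and the function names are hypothetical.

```python
import math


def detection_probability(rel_abundance: float, n_reads: int) -> float:
    """P(at least one read from the organism) = 1 - (1 - p)^N,
    assuming independent reads sampled in proportion to abundance."""
    if not 0.0 <= rel_abundance <= 1.0:
        raise ValueError("relative abundance must be between 0 and 1")
    return 1.0 - (1.0 - rel_abundance) ** n_reads


def reads_needed(rel_abundance: float, confidence: float = 0.95) -> int:
    """Smallest N for which the detection probability reaches `confidence`."""
    if not 0.0 < rel_abundance < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("abundance and confidence must lie strictly in (0, 1)")
    # Solve 1 - (1 - p)^N >= c for N; log1p keeps precision for tiny p.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-rel_abundance))


# An organism that makes up one read in a million (hypothetical figure).
print(reads_needed(1e-6))
print(detection_probability(1e-6, 1_000_000))
```

Under these assumptions, an organism at one part per million needs on the order of three million reads for a 95% chance of being seen even once, and a single read says little about which genes it carries.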

Wommack uses whole genome sequencing (WGS)—generally paired-end Illumina sequencing, sometimes coupled with Pacific Biosciences-based long-read sequencing (“the longer the better”)—to query the viral microbiome. For such studies, 16S sequencing is not even an option, because “there’s no equivalent of 16S in viruses.” He then tries to assemble sequences, essentially whole (or large portions of) viral genomes, and looks to see what genes they contain. Metagenomics, he says, has enabled him to find certain genes that could be very helpful in doing population-scale characterizations of viral communities and subcommunities that would not have been possible otherwise.

WGS is being used routinely to do species-level identification of cellular organisms, but “the part that’s really hard is getting the functional information from shotgun data,” says Eisen. The bioinformatics becomes a mess when you “consider that two related E. coli strains can have really different functional contents, while dissimilar taxa can have the same functional contents.”

Reasons for this include the lack of reference genomes against which to align most cellular organisms and the lack of adequate annotation in those reference genomes that do exist, adds White. The National Institutes of Health (NIH)-funded Human Microbiome Project (HMP) is helping to fill that gap for the human microbiome; in mid-2012, it boasted 800 sequenced reference strains [1]. Ultimately, a focus of the HMP is to “sequence 3000 genomes from both cultured and uncultured bacteria, plus several viral and small eukaryotic microbes isolated from human body sites.” This complements the European-funded MetaHIT project, which ended in 2012 and focused on the healthy and diseased gut. Yet these together represent only a small fraction of the total number of OTUs. And there is still no gold standard for aligning and analyzing the data, nor for “what to do with the metagenome data once you get it,” observes White. “It’s an evolving field.”

Researchers also are going “beyond the genetic potential and looking at what genes are expressed, but very rarely, because it’s hard,” says Eisen. They are trying all the same tools and techniques to look at microbiomes as are being used to understand the human and other genomes: “They are doing transcriptomics and metabolomics and epigenetics and network analysis. It’s just a lot messier when you have a mixed sample of organisms than when you have one organism.”

Sample prep

An underappreciated part of microbiomics that can influence results is sample preparation and storage. This includes separation techniques to reduce complexity, extraction and storage protocols that help obtain representative samples and even optimizing the number of PCR amplification cycles so as not to unduly introduce bias. “You’re trying to measure the relative abundance of taxa that have vastly different cell shapes and cell walls and cell membranes and DNA packaging proteins and components,” Eisen points out.
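To see why the number of PCR cycles matters, consider a deliberately simple model in which each taxon's template is multiplied by (1 + efficiency) per cycle. The efficiencies used below (0.95 and 0.90) are made-up values for illustration, not measurements, and real amplification also plateaus and is affected by primer mismatches and chimeras that this sketch ignores.

```python
def amplified_ratio(start_ratio: float, eff_a: float, eff_b: float, cycles: int) -> float:
    """Ratio of taxon A to taxon B after `cycles` rounds of PCR, assuming each
    template grows by a factor of (1 + efficiency) in every cycle."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return start_ratio * ((1.0 + eff_a) / (1.0 + eff_b)) ** cycles


# Two taxa that start out equally abundant, with hypothetical efficiencies.
for cycles in (15, 25, 35):
    ratio = amplified_ratio(1.0, eff_a=0.95, eff_b=0.90, cycles=cycles)
    print(f"{cycles} cycles: A/B = {ratio:.2f}")
```

In this model a small per-cycle difference compounds: two taxa that begin at a 1:1 ratio drift further apart with every additional cycle, which is one reason to keep the cycle count as low as the protocol allows.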

Ryan Kemp, director of nucleic acid solutions at Zymo Research, recommends using a mechanical homogenization technique, rather than chemical or enzymatic lysis, to help achieve a “nonbiased extraction of the microbes.” He also advocates a room-temperature nucleic acid stabilizer, such as Zymo’s DNA/RNA Shield™, which inactivates nucleases and viruses to “immediately freeze the sample in time”—otherwise the composition of the sample is likely to change.

The field is rapidly evolving and perhaps even maturing. A year ago, Eisen would have said it was primarily discovery-based, but he has noted significantly more hypothesis-driven research of late. As for what’s next: To achieve the gold standard of understanding causality (as opposed to just association), we need to use a multi-omic approach, opines White. “It’s the absolute future of microbiomics.”

Reference

[1] Human Microbiome Project Consortium, “A framework for human microbiome research,” Nature, 486:215–21, 2012. [PubMed ID: 22699610]

Image: Detail of the E. coli K12 chromosome map, created using BacMap.
