Emerging Pathogens: Out of the Shadows, into the Database

Josh P. Roberts has an M.A. in the history and philosophy of science, and he also went through the Ph.D. program in molecular, cellular, developmental biology, and genetics at the University of Minnesota, with dissertation research in ocular immunology.


When Claire Fraser, director of the Institute for Genome Sciences at the University of Maryland School of Medicine, got involved with microbial genomics 20 years ago, it took a year or more to sequence a single genome. Today, “most genome projects are being used to compare hundreds, if not thousands, of strains of a pathogen of interest” already found in databases, Fraser points out.

Over the years, this informative technology has gone through several iterations—including Sanger sequencing, pyrosequencing and the first massively parallel “next-generation” sequencers introduced a decade ago. And, of course, pathogens weren’t always—and largely still aren’t—identified by sequencing. Here we look at the ability to identify and surveil for emerging diseases, and how the process has evolved.

What is an “emerging” pathogen?

The Emerging Infectious Diseases journal, published by the Centers for Disease Control and Prevention (CDC), defines an emerging infectious disease as one that has been increasing in prevalence or threatens to do so in the near future. This includes not only previously characterized diseases but also those, like Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS), that haven’t been seen before.

And then there are the re-emerging diseases, caused by known pathogens, that have either been dormant for some time or have been eliminated in a geographic region but have pocket outbreaks, explains Kelly Wroblewski, director of infectious diseases at the Association of Public Health Laboratories. Measles, Ebola and cholera are examples of the latter: “You have outbreaks, and then you manage to contain them, but there will be another one at some point. Those are as opposed to viruses that are more endemic, like influenza, which are seasonal and never totally go away.”

It is perhaps more a point of semantics than science whether the 2011 European E. coli outbreak, which caused about 3,500 hospitalizations and 50 deaths, was caused by a novel pathogen. Sequencing the associated bacteria revealed that the culprit was effectively a known organism carrying a known additional virulence factor, but the two hadn’t historically been seen together. In this case, the strong binding ability of enteroaggregative E. coli had combined with Shiga-toxin production from enterohemorrhagic E. coli. “So now you have bacteria that binds tight and produces toxin, which changes the dynamics of the disease presentation,” explains David Rasko, associate professor of microbiology and immunology at the University of Maryland, and first author of the paper describing the findings. The authors concluded that “horizontal genetic exchange … facilitates the emergence of new pathogens” [1].

How does a sequence help?

There are different ways to determine the pathogen responsible for an illness. A battery of serologic-, culture- and microscopy-based examinations is often the first set of tests to be run, to look for the usual suspects. And generally speaking, a staphylococcus infection or strep throat will be treated by standard protocols without needing to know the variant of the subspecies involved. “It’s extremely rare that you would go all the way to sequencing, in the clinic,” Rasko says.

In the public-health realm, except in extraordinary instances, whole-genome sequencing (WGS) has mainly been limited to situations such as food-borne outbreaks and influenza, with limited use in characterizing tuberculosis and viral-hepatitis outbreaks, says Wroblewski. Targeted sequencing may be used in some cases, such as to determine whether a person with measles “is sick with the vaccine strain or the wild type, because vaccine can cause mild illness in some people.” She explains that “we’re really in a transition period where we’re figuring out where the best applications of WGS are and how to use resources most efficiently: where to abandon old technologies and move exclusively to WGS, and where targeted sequencing will do the trick. Just because you can sequence the entire genome doesn’t mean you need to.”

On occasions when the standard therapy doesn’t do the trick, or more information is necessary for public-health or other reasons, some sort of sequencing may be called for. For example, treating the Shiga-toxin-producing bacteria from the European outbreak with the antibiotic ciprofloxacin (Cipro) caused an increase in toxin expression of about 83-fold. “So from a clinical standpoint it’s important to know what’s going on with the construction of the bacteria, so you can adjust treatment outcomes appropriately,” Rasko explains.

A genome sequence can answer many questions at once, Fraser says. What properties, such as antibiotic resistance or virulence, might the pathogen have? What is the isolate? Is it something we know about, or is it brand new? Knowing which strains an isolate most resembles may provide insight into its origin: the “index case” of an airborne disease, for example, or the source of a foodborne illness. She notes that the strains of cholera isolated in Haiti following the 2010 earthquake were very similar to those historically seen in parts of South Asia, and so “the simplest explanation was that the cholera outbreak in Haiti probably came with a relief worker from that part of the world.”
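
The idea of asking which known strains an isolate most resembles can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to the same length and simply counts the positions at which they differ; the strain names, the toy sequences and the function names are hypothetical, and real comparisons involve whole genomes and dedicated phylogenetic tools rather than anything this simple.

```python
def snp_distance(seq_a, seq_b):
    """Count the positions at which two pre-aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def closest_strain(isolate, references):
    """Return (name, distance) for the reference strain nearest the isolate.

    `references` maps a strain name to its aligned sequence.
    """
    distances = {name: snp_distance(isolate, seq)
                 for name, seq in references.items()}
    best = min(distances, key=distances.get)
    return best, distances[best]


# Hypothetical toy data: two reference strains and one new isolate.
references = {
    "reference_A": "ACGTACGTAC",
    "reference_B": "ACGAACTTAC",
}
isolate = "ACGAACTTAG"

name, distance = closest_strain(isolate, references)
print(f"Closest match: {name} ({distance} differing position(s))")
```

A small distance to a strain from a particular region is only a hint about where an outbreak may have come from; it does not by itself establish the route of transmission.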

Then and now

For the past two decades, the CDC’s PulseNet has used DNA fingerprinting (restriction-enzyme digestion of genomic DNA followed by pulsed-field gel electrophoresis) to define food-borne illness outbreaks and match the culprits to known isolates in a vast database of gel images. “If you have 100% concordance on all of the bands, then you can assume that what you’re looking at is the same; if the bands are identical in all but a couple, you know that’s probably closely related,” Fraser says. There are equivalent surveillance programs for non-food-borne outbreaks as well, explains Wroblewski. These programs are gradually transitioning to sequencing-based approaches, keeping in mind that not all labs are equipped with sophisticated sequencing capabilities.
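
The band-matching rule of thumb Fraser describes can be sketched in a few lines of Python. This is an illustration only, not the software PulseNet uses: band patterns are represented as sets of fragment sizes, the sizes themselves are made up, and the cutoff of two differing bands for “a couple” is an assumption. Real gel analysis also has to allow for small shifts in band position, which this sketch ignores.

```python
def differing_bands(pattern_a, pattern_b):
    """Number of bands present in one pattern but not the other."""
    return len(set(pattern_a) ^ set(pattern_b))


def dice_similarity(pattern_a, pattern_b):
    """Dice coefficient: shared bands relative to the total band count."""
    a, b = set(pattern_a), set(pattern_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def classify(pattern_a, pattern_b, max_diff=2):
    """Apply the rule of thumb; max_diff=2 is an assumed cutoff."""
    diff = differing_bands(pattern_a, pattern_b)
    if diff == 0:
        return "indistinguishable"
    if diff <= max_diff:
        return "probably closely related"
    return "not shown to be related"


# Hypothetical fragment sizes (in kilobases) for three isolates.
outbreak_isolate = [668, 452, 310, 244]
database_match = [668, 452, 310, 244]
near_match = [668, 452, 310, 216]

print(classify(outbreak_isolate, database_match))
print(classify(outbreak_isolate, near_match))
print(dice_similarity(outbreak_isolate, near_match))
```

Counting differing bands mirrors the quoted rule directly, while the Dice coefficient gives a single similarity score that is convenient for ranking many database entries against one query pattern.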

Sequencing itself has evolved: “Before truly high-throughput sequencing—before the Illumina platforms that we have now—when we were using Sanger sequencing to look at the 16S [ribosomal RNA gene] profile of a community, it was significantly labor-intensive to actually do the cloning, to make sure you had somewhat equal representation across your community and then to do the actual sequencing,” notes Rasko. “So even within the sequencing era, we’ve changed from doing relatively shallow sampling to now, where we have high-throughput machines that are capable of doing much deeper sampling. I think that gets us to the point of being able to identify the more rare events on a much more regular basis. Not all organisms, when they become pathogenic, become the dominant species.”
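
Rasko’s point about deeper sampling can be made concrete with a little arithmetic. Under the simplifying assumption that each read is drawn independently and in proportion to an organism’s share of the community, the chance of seeing that organism at least once in n reads is 1 - (1 - f)^n, where f is its fraction of the community. The Python sketch below applies that formula; the abundance and the read counts are illustrative values, not figures from the article.

```python
def detection_probability(fraction, reads):
    """Chance of sampling an organism at least once.

    Assumes each read is an independent draw and that the organism
    makes up `fraction` of the community (a simplification).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if reads < 0:
        raise ValueError("reads must be non-negative")
    return 1.0 - (1.0 - fraction) ** reads


# Illustrative values: an organism making up 0.1% of the community,
# sampled shallowly (a few hundred clones) versus deeply.
rare_fraction = 0.001
for reads in (100, 1_000, 100_000):
    p = detection_probability(rare_fraction, reads)
    print(f"{reads:>7} reads: {p:.3f} chance of at least one hit")
```

With these assumed numbers, a hundred reads would usually miss the rare organism altogether, while a hundred thousand would almost certainly pick it up.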

To make up for the loss of the roughly 1,000-base-pair reads that Sanger sequencing afforded (albeit at much lower throughput), Fraser has taken to using a Pacific Biosciences (PacBio) instrument’s longer reads “as a scaffold to help us assemble the shorter Illumina reads, which are enormously helpful in putting a complete bacterial chromosome together into a single ordered and oriented contig.”

Even used in combination, these sequencing technologies won’t give you a perfect answer, but for most applications that doesn’t matter. “You don’t need to determine every single base pair with a high degree of certainty,” Fraser notes. One exception? The work she did with the Federal Bureau of Investigation (FBI) in the Amerithrax investigation of anthrax sent through the mail. For forensics, she says, “we needed to be as sure as we possibly could that the genome sequences that we were completing were correct at every single base pair.”

Reference

[1] Rasko, DA, et al., “Origins of the E. coli strain causing an outbreak of hemolytic-uremic syndrome in Germany,” N Engl J Med, 365:709-17, 2011. [PubMed ID: 21793740]

