A team of researchers from MIT has published a comprehensive map of noncoding DNA. The map, which appears in Nature today, provides in-depth annotation of epigenomic marks—modifications indicating which genes are turned on or off in different types of cells—across 833 tissues and cell types. The researchers also identified groups of regulatory elements that control specific biological programs, and they uncovered candidate mechanisms of action for about 30,000 genetic variants linked to 540 specific traits.
"What we're delivering is really the circuitry of the human genome. Twenty years later, we not only have the genes, we not only have the noncoding annotations, but we have the modules, the upstream regulators, the downstream targets, the disease variants, and the interpretation of these disease variants," says Manolis Kellis, senior author of the study.
The new map, EpiMap (Epigenome Integration across Multiple Annotation Projects), builds on and combines data from several large-scale mapping consortia, including ENCODE, Roadmap Epigenomics, and Genomics of Gene Regulation. The 833 biosamples represent diverse tissues and cell types, but each was mapped with a slightly different subset of epigenomic marks, which made it difficult to fully integrate data across the consortia. To close these gaps, the team filled in the missing datasets by combining available data for similar marks and biosamples. They then used the resulting compendium of 10,000 marks across 833 biosamples to study gene regulation and human disease.
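The idea of filling in a missing dataset from similar marks and biosamples can be illustrated with a toy sketch. This is not the EpiMap pipeline: the `similarity` and `impute` functions, the inverse-distance weighting, and the numbers below are all assumptions made for illustration. Each row is a hypothetical biosample, each column a hypothetical mark, and `None` stands for a dataset that was never measured.

```python
from math import sqrt

def similarity(a, b):
    """Inverse-distance similarity over the marks observed in both biosamples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    dist = sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))
    return 1.0 / (1.0 + dist)

def impute(signal):
    """Return a copy of `signal` where each None is replaced by a
    similarity-weighted mean of the same mark in other biosamples."""
    filled = [row[:] for row in signal]
    for i, row in enumerate(signal):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other biosamples that did measure mark j.
            donors = [
                (similarity(row, other), other[j])
                for k, other in enumerate(signal)
                if k != i and other[j] is not None
            ]
            total = sum(w for w, _ in donors)
            if total > 0:
                filled[i][j] = sum(w * v for w, v in donors) / total
    return filled

# Toy data (made up): 3 biosamples x 3 marks; biosample 0 lacks the third mark.
signal = [
    [1.0, 2.0, None],
    [1.0, 2.0, 4.0],
    [5.0, 6.0, 8.0],
]
filled = impute(signal)
```

In this toy case the imputed value for biosample 0 falls much closer to biosample 1's measurement than to biosample 2's, because biosample 1 matches it on the marks they share.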
The researchers annotated more than 2 million enhancer sites, which cover only 0.8 percent of the genome in any single biosample but collectively span 13 percent of the genome. They grouped these enhancers into 300 modules based on their activity patterns, and linked each module to the biological processes it controls, the regulators that control it, and the short sequence motifs that mediate this control. The researchers also predicted 3.3 million links between control elements and the genes they target, based on coordinated activity patterns across biosamples, representing the most complete circuitry of the human genome to date.
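Linking a control element to a gene through coordinated activity can be sketched as a correlation across biosamples. The example below is a simplified illustration, not the study's method: the Pearson correlation, the 0.9 threshold, and the names `enh_A`, `enh_B`, `gene_X`, and `gene_Y` are all assumptions, and it ignores factors such as genomic distance.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def link_enhancers(enhancers, genes, threshold=0.9):
    """Return (enhancer, gene, r) for every pair whose activity across
    biosamples correlates at or above `threshold`."""
    links = []
    for e_name, e_activity in enhancers.items():
        for g_name, g_expression in genes.items():
            r = pearson(e_activity, g_expression)
            if r >= threshold:
                links.append((e_name, g_name, r))
    return links

# Toy data (made up): activity of each element across four biosamples.
enhancers = {
    "enh_A": [0.0, 1.0, 0.0, 1.0],
    "enh_B": [1.0, 0.0, 1.0, 0.0],
}
genes = {
    "gene_X": [0.1, 0.9, 0.2, 0.8],  # rises and falls with enh_A
    "gene_Y": [1.0, 1.0, 1.0, 1.0],  # constant, so uninformative
}
links = link_enhancers(enhancers, genes)
```

Here only the `enh_A` and `gene_X` pair passes the threshold. `enh_B` is anti-correlated with `gene_X`, and the constant `gene_Y` carries no information about any enhancer.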
The researchers have made all of their data publicly available for the broader scientific community to use.