Whether it’s flushing after a drink, reacting to certain smells, or metabolizing drugs differently, many of our biological quirks trace back to differences in our DNA. While all humans share the same genes, their specific combinations of sequence variations—called gene types—define how those genes function in individuals. Traditional sequencing tools could only assess short stretches of DNA, providing isolated fragments of data. Long-read sequencing has now extended that reach, connecting these fragments into continuous lines that reveal how variations interact to form functional gene types.
Yet, a persistent challenge has been the lack of a universal system to describe these gene types. “It is like explaining a cup by only listing the shape of its handle, its color, or other separate features. It creates barriers to cross-study comparison and slows translation into healthcare,” says Professor Masao Nagasaki, first author of the new study published in Nucleic Acids Research. Different research fields have developed their own naming conventions—for example, in transplant matching or drug metabolism—but none has achieved broad acceptance.
To close this gap, Nagasaki and his team established the ACTG hierarchical nomenclature and built the Joint Open Genome and Omics Platform 1.0 (JoGo 1.0). The project spanned nearly five years, including about two and a half years dedicated to constructing the database.
The ACTG system takes its letters from the four DNA bases and classifies gene types at four levels: A (amino acid sequence), C (coding sequence), T (transcript level, including untranslated regions), and G (the full gene body). “One key feature is that we rank gene types based on global frequency,” Nagasaki explains. For instance, the primary version of the ALDH2 gene, which helps break down acetaldehyde, is labeled ALDH2:a1c1t1g1. A less active variant, common in East Asian populations and known for causing facial flushing after alcohol consumption, is labeled ALDH2:a2.
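To make the label structure concrete, here is a minimal Python sketch that splits a label such as ALDH2:a1c1t1g1 into its four levels. The grammar is an assumption inferred only from the two examples quoted above: a gene symbol, a colon, then the levels in the order a, c, t, g, with the more specific levels optional. The names `parse_gene_type` and `GeneType` are hypothetical and are not part of JoGo 1.0; the official format may allow forms this sketch rejects.

```python
import re
from typing import NamedTuple, Optional

# Assumed label shape, inferred from "ALDH2:a1c1t1g1" and "ALDH2:a2" only.
# Levels must appear in the order a, c, t, g; later levels may be omitted.
_LABEL = re.compile(
    r"^(?P<gene>[A-Za-z0-9-]+):"
    r"a(?P<a>\d+)(?:c(?P<c>\d+)(?:t(?P<t>\d+)(?:g(?P<g>\d+))?)?)?$"
)


class GeneType(NamedTuple):
    """Hypothetical container for one parsed label."""
    gene: str
    a: int                # amino acid sequence level
    c: Optional[int]      # coding sequence level
    t: Optional[int]      # transcript level (including untranslated regions)
    g: Optional[int]      # full gene body level


def parse_gene_type(label: str) -> GeneType:
    """Split a label into its gene symbol and per-level numbers."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a recognised gene type label: {label!r}")

    def level(name: str) -> Optional[int]:
        value = match.group(name)
        return int(value) if value is not None else None

    return GeneType(
        gene=match.group("gene"),
        a=int(match.group("a")),
        c=level("c"),
        t=level("t"),
        g=level("g"),
    )


if __name__ == "__main__":
    print(parse_gene_type("ALDH2:a1c1t1g1"))
    print(parse_gene_type("ALDH2:a2"))
```

A label that stops at the amino acid level, such as ALDH2:a2, leaves the finer levels unset (`None`), reflecting that it groups every coding, transcript, and gene-body variant sharing that protein sequence.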
JoGo 1.0 integrates data from 258 genomes representing all five inhabited continents, combining public resources with new sequences from the 1000 Genomes Project. Its interactive and local viewers enable secure analysis of 4.7 million gene types across more than 19,000 genes, linking these to major databases such as ClinVar, the GWAS Catalog, and GTEx. As Nagasaki notes, “Having consistent names for whole genes means we can finally speak a common language. Just as there is active research and discussion around blood types today, I hope this new nomenclature will lead to a deeper understanding of, and public dialogue around, human gene types.”