Studying a biological system with a single modality is like attempting to understand a puzzle by looking at a single piece. These complex systems require a combination of assessments to even begin to comprehend their processes and state. The importance of this approach has been recognized by life science researchers, and multiomics tools for data generation and analysis have proliferated in recent years.

What is multiomics data?

“Multiomics data, at its core, refers to multimodal data derived from a single sample,” explained Telma Santos, Ph.D., Strategic Portfolio Expert for Imaging Software at Miltenyi Biotec. This involves integrating two or more omic datasets, whether genomics, transcriptomics, proteomics, epigenomics, lipidomics, or other modalities. Each of these data types has historically been analyzed separately, noted Matt Newman, Senior Vice President and General Manager of Pharma and Diagnostics at DNAnexus. With the rise of integrative multiomics, however, Newman explained that “researchers are able to identify biomarkers, understand biological pathways, and answer complex questions.”

Why multiomics?

The appeal of multiomics lies in its ability to uncover relationships between molecular layers that could not be determined using one modality alone. This is particularly true when two related modalities show unexpected discordance. “People have studied RNA for a long time to make predictions about proteins,” noted Kristopher Nazor, CEO of Proteintech Genomics. “But those predictions are not always right.” For instance, Nazor pointed out that cells with high RNA levels may lack the corresponding proteins and vice versa, a case often seen in response elements and signaling cascades. Additionally, post-translational modifications (PTMs), which significantly influence protein function, cannot be measured or predicted effectively using RNA alone. “It’s very important that you collect these modalities and analyze them together,” Nazor emphasized.
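To make the idea of RNA-protein discordance concrete, the sketch below flags markers whose RNA and protein levels do not rise and fall together across cells. It is a minimal illustration only: the marker names, the measurements, and the 0.3 cutoff are invented for the example, and Spearman rank correlation is simply one reasonable way to compare the two layers, not a method attributed to any of the experts quoted here.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no rank information
    return cov / (var_x * var_y) ** 0.5


def discordant_markers(rna, protein, threshold=0.3):
    """List markers whose RNA-protein correlation falls below the (assumed) cutoff."""
    shared = (m for m in rna if m in protein)
    return sorted(m for m in shared if spearman(rna[m], protein[m]) < threshold)


# Hypothetical per-cell measurements for two made-up markers.
rna = {"MARKER_A": [1, 2, 3, 4, 5], "MARKER_B": [5, 4, 3, 2, 1]}
protein = {"MARKER_A": [10, 20, 30, 40, 50], "MARKER_B": [1, 2, 3, 4, 5]}

print(discordant_markers(rna, protein))
```

In this toy data, MARKER_A is concordant across the two layers while MARKER_B moves in opposite directions, so only MARKER_B is flagged.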

These approaches are also fundamental to advancing precision health. Newman explained that incorporating various data types and even electronic health records is key to better understanding disease mechanisms. As Newman stated, multiomics is essential for “providing the most effective therapeutic treatment for a specific cohort of individuals.”

Tools and frameworks

Multiomics research relies on a range of tools and frameworks to support both data generation and downstream analysis. These include assays that capture different omic layers, platforms designed for integrating diverse datasets, and software solutions that enable researchers to uncover meaningful biological insights. Among these tools, Nazor highlighted the power of Proteintech’s human discovery panel, a single-cell assay designed to work with 10x Genomics’ Flex chemistry. The panel targets 350 markers, including cell surface proteins, transcription factors, and PTMs.

In spatial multiomics, Santos shared insights into the MACSima™ platform, which integrates proteomics and transcriptomics from the same tissue section. Its advanced software ensures precise alignment, cellular segmentation, and feature extraction. Santos noted that this platform enables a variety of scientific applications from a single tissue section, including “cross-validation of upstream screening methods, detection of secreted factors like chemokines or cytokines, RNA-protein co-expression analysis, and signaling pathway analysis.” While spatial multiomics remains in its exploratory phase, requiring expertise that bridges individual omics modalities, Santos emphasized her team’s commitment to understanding user needs and creating tools that streamline workflows and enhance efficiency.


For data analysis frameworks, Newman highlighted the capabilities of the DNAnexus platform. The process begins with harmonizing metadata to facilitate integration across various omic data types. Next, DNAnexus employs a range of algorithms to enable cross-layer analyses with publicly available tools such as MOFA and MDI. Newman also pointed to newer algorithms, like scGPT, which show potential for application across multiple omic modalities. “The general strategy is to use more than one of these tools at a time to find the best results,” he explained.
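As a rough illustration of what harmonizing metadata can involve, the sketch below reconciles sample identifiers that were written differently in each omic table and then keeps only the samples measured in every layer. It does not represent the DNAnexus platform or its API; the identifier convention, the table names, and the sample IDs are all assumptions made for the example.

```python
import re


def normalize_sample_id(raw):
    """Map an ID to a canonical form: trimmed, uppercase, separators unified to '-'.

    The canonical form is an assumed convention chosen for this example.
    """
    return re.sub(r"[\s_.]+", "-", raw.strip()).upper()


def shared_samples(tables):
    """Return the sorted sample IDs present in every omic table after normalization."""
    id_sets = [{normalize_sample_id(s) for s in table} for table in tables.values()]
    return sorted(set.intersection(*id_sets)) if id_sets else []


# Hypothetical tables keyed by sample ID; the measurements themselves are omitted.
tables = {
    "transcriptomics": {"pt_001": {}, "pt_002": {}, "pt_003": {}},
    "proteomics": {"PT-001": {}, "PT 002": {}},
    "epigenomics": {"Pt.001": {}, "pt-002": {}, "pt-004": {}},
}

print(shared_samples(tables))
```

Only the samples that survive this step can be passed to a cross-layer method, which is one reason metadata harmonization comes first in the workflow Newman described.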

Overcoming integration challenges

Despite its potential, multiomics presents various analytical challenges. According to Newman, the greatest challenge is figuring out how to make full use of the data. “It’s easy to revert to standard analyses for each omic type, but leveraging all of the knowledge around the data is more of a challenge.” With the rapid progress in omics, Newman noted that the DNAnexus platform has been custom-built to adapt and handle various data types. “This customization helps with some of the standard challenges, including harmonization of metadata and the ability to leverage large-scale computing of both CPUs and GPUs,” he added.

Nazor acknowledged that integrating multiomics data faces challenges similar to early bulk sequencing efforts, such as batch effects, technical artifacts, and differences in sequencing depth or sample collection. These issues are further complicated at the single-cell level, particularly when integrating large datasets collected across multiple labs over time. He noted that while single-experiment analyses are relatively straightforward with popular tools like Loupe Browser, scanpy, or Seurat, harmonizing large-scale data and ensuring meaningful comparisons between labs is much more difficult. Nazor emphasized the utility of deep generative models like scVI and its derivative totalVI, which allow the integration of RNA and protein data into a single model [1,2]. These tools allow users to cluster cells based on joint variability in RNA and protein, enhancing the ability to make comprehensive comparisons and integrate large amounts of data from diverse sources effectively.
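Differences in sequencing depth are the simplest of these issues to illustrate. The sketch below applies library-size normalization followed by a log transform, a common first step in single-cell workflows, to two hypothetical cells with the same composition but a tenfold difference in depth. It is not a substitute for model-based integration with scVI or totalVI, and the counts and the target total of 10,000 are assumptions chosen for the example.

```python
import math


def normalize_depth(counts, target_sum=10_000):
    """Scale one cell's counts to a common total, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)  # an empty cell stays all zeros
    return [math.log1p(c * target_sum / total) for c in counts]


# Two hypothetical cells with identical composition, sequenced at different depths.
shallow_cell = [1, 3, 6]
deep_cell = [10, 30, 60]

print(normalize_depth(shallow_cell))
print(normalize_depth(deep_cell))
```

After normalization the two cells have matching profiles, so the depth difference no longer masquerades as a biological one. Batch effects between labs are harder to remove, which is where the generative models Nazor mentioned come in.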

Real-world applications

Recent life science publications are filled with applications that exemplify the importance of multiomics research. One such study from the Grimes lab used CITE-seq to explore the impact of GFI1 mutations in severe congenital neutropenia [3]. The integration of these data types allowed the researchers to distinguish how subtle variations in marker intensities, such as CD11B and Ly6g, correlate with distinct cell states during neutrophil development. These shifts, often overlooked as technical artifacts in traditional analyses, demonstrate biologically meaningful gradients similar to developmental processes like the Sonic Hedgehog pathway. Nazor explained that only by having the multiomic data could the researchers say whether this variation was significant or meaningless.

Another notable example comes from an innovative study involving spatial proteogenomics [4]. “This research utilized multiple omics modalities to validate key discoveries in hepatic cells, aiming to enhance diagnostic approaches for patients in the future,” shared Santos. Even though the omics data were derived from different samples, this study demonstrated the power of integrating multimodal data to identify distinct and evolutionarily conserved hepatic macrophage niches. This comprehensive atlas has improved our understanding of hepatic cell organization and microenvironmental interactions in both health and disease.

A recent study in Scientific Reports demonstrated how multiomics approaches resolved the complex regulatory networks of programmed cell death in hepatocellular carcinoma [5]. Newman shared that this understanding led to insights into patient stratification strategies for targeted therapies, offering potential improvements in treatment outcomes through precision health approaches. Additionally, a Nature Communications study employed multiomics techniques, including scRNA-seq, scATAC-seq, and histological characterizations, to investigate the negative impact of the ENL-T1 mutation on kidney development [6]. This integration of data revealed that this mutation disrupts development by rewiring gene regulation, impairing progenitor differentiation, and inducing lineage-specific transcriptional and chromatin changes.

The future of multiomics

All three experts expressed optimism about the future of multiomics research. Newman specifically emphasized the importance of cloud technology and artificial intelligence as transformative tools for advancing data integration and accelerating breakthroughs in biomedical research. Cloud technology is already enabling secure, scalable sharing of large datasets, while also fostering collaboration across the scientific community. Meanwhile, new AI algorithms are more powerful than ever and have the capability to self-train, generating new rules and discoveries. Newman believes these two technologies “will help exponentially increase our understanding of disease mechanisms, organ development, and delivery of therapeutic or prophylactic treatments tailored to specific populations.”

The integration of spatial biology data with generative AI, Santos noted, also holds immense promise for advancing multiomics research by tackling challenges like cost, time, and dataset complexity. Expanding analyzable modalities within spatial samples and using AI-driven methods will enable deeper insights into cellular interactions. “Generative AI is expected to revolutionize data integration by addressing the growing volume and complexity of spatial multiomics datasets,” Santos stated.

In addition to AI-based analysis, Nazor emphasized the potential of new technologies like single-cell proteomic and transcriptomic methods. These approaches enable unbiased, high-throughput experiments and eliminate the need to pre-select targets, thereby enhancing discovery potential. “Discovery really happens when we stop looking for the answers that we expect to find,” stated Nazor.

References

1. Lopez R, Regier J, Cole MB, Jordan MI, Yosef N. Deep generative modeling for single-cell transcriptomics. Nat Methods. 2018;15(12):1053-1058. 

2. Gayoso A, Steier Z, Lopez R, et al. Joint probabilistic modeling of single-cell multi-omic data with totalVI. Nat Methods. 2021;18(3):272-282. 

3. Muench DE, Olsson A, Ferchen K, et al. Mouse models of neutropenia reveal progenitor-stage-specific defects. Nature. 2020;582(7810):109-114. 

4. Guilliams M, Bonnardel J, Haest B, et al. Spatial proteogenomics reveals distinct and evolutionarily conserved hepatic macrophage niches. Cell. 2022;185(2):379-396.e38. 

5. Chen L, Hu Y, Li Y, et al. Integrated multiomics analysis identified comprehensive crosstalk between diverse programmed cell death patterns and novel molecular subtypes in Hepatocellular Carcinoma. Sci Rep. 2024;14(1):27529. Published 2024 Nov 11. 

6. Song L, Li Q, Xia L, et al. Single-cell multiomics reveals ENL mutation perturbs kidney developmental trajectory by rewiring gene regulatory landscape. Nat Commun. 2024;15(1):5937. Published 2024 Jul 15.