Artificial intelligence (AI) has permeated our daily lives, but whether an AI system can be trusted depends on more than the model alone. According to Peter Koo of Cold Spring Harbor Laboratory (CSHL), scientists using popular computational tools to interpret AI predictions are picking up too much “noise,” or extra information, when analyzing DNA. In a paper published in Genome Biology last month, Koo and his team describe a solution to this problem. With just a few additional lines of code, scientists can obtain more reliable explanations from deep neural networks, helping them uncover essential DNA features that may lead to breakthroughs in health and medicine. The key lies in reducing the noise that obscures the signal.

The origin of this noise can be likened to a digital equivalent of "dark matter," an invisible and mysterious source. In physics and astronomy, dark matter is believed to make up a significant portion of the universe, exerting gravitational effects despite eluding direct observation. Similarly, Koo's team discovered that the data used to train AI models lack critical information, leaving significant blind spots. These blind spots in turn influence how AI predictions about DNA function are interpreted, introducing unwanted noise into the analysis. Koo's research demonstrates that the problem affects numerous prominent AI models across a wide range of applications.

The crux of the issue is that scientists have borrowed interpretation techniques from computer vision AI, which is designed primarily for image data. Unlike an image, whose pixels vary continuously, DNA is written with just four nucleotide letters (A, C, G, T), so each position in a sequence holds exactly one of four discrete values. When techniques built for continuous pixel data are applied to AI models trained on DNA, they do not account for this constraint, and noise enters the analysis. By integrating Koo's computational correction, scientists can achieve a more accurate interpretation of AI-generated DNA analyses.
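To make the contrast with pixels concrete, here is a minimal Python sketch of how DNA is commonly presented to a neural network: as a "one-hot" grid in which each position carries a single 1 in one of four channels. The article does not spell out the form of Koo's correction, so the `correct_gradients` function below is an assumption for illustration only: it subtracts, at each position, the average of the four per-nucleotide importance values, which removes any component that treats all four letters alike. The function names and the random stand-in scores are hypothetical and do not come from a trained model.

```python
import numpy as np

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a (length, 4) array with a single 1 per row."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = np.zeros((len(seq), 4))
    for pos, base in enumerate(seq):
        encoded[pos, index[base]] = 1.0
    return encoded


def correct_gradients(grads):
    """Assumed correction: subtract the per-position mean across the four channels."""
    return grads - grads.mean(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
x = one_hot("ACGTTGCA")
raw = rng.normal(size=x.shape)  # stand-in for importance scores from a real model
corrected = correct_gradients(raw)
```

Under this assumption the correction is indeed only a line or two of code: after it, the four scores at each position sum to zero, while the differences between nucleotides at that position, the part that distinguishes one letter from another, are left unchanged.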

Koo's approach brings clarity to DNA analysis, making sites of interest stand out more sharply and reducing spurious noise elsewhere. Nucleotides previously deemed critical may turn out to be less important, giving researchers a more refined picture of DNA signals. Koo believes that the noise problem extends beyond AI-powered DNA analyzers and affects a broader range of computational processes that deal with similar types of data.