Researchers from Harvard and Northwestern have developed a machine learning approach that designs intrinsically disordered proteins (IDPs) with custom properties. Intrinsically disordered proteins never settle into a fixed shape; instead, they shift continuously among many conformations. They are key to many biological functions, including cross-linking molecules, sensing, and signaling, but their constantly shifting conformations make them difficult to predict or design with current artificial intelligence methods such as AlphaFold.

This new method applies automatic differentiation, a computational technique that traditionally supports deep learning by computing exact derivatives efficiently. By combining automatic differentiation with molecular dynamics simulations, the researchers optimize protein sequences according to how small changes in the sequence affect the simulated protein properties. This gradient-based optimization framework acts like a powerful search engine for amino acid sequences, identifying those that fulfill desired dynamic behaviors without relying on vast datasets or on training a typical machine learning model.
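To make the idea concrete, here is a minimal sketch of gradient-based design with automatic differentiation. It is not the researchers' code and involves no molecular dynamics: the `toy_property` function is a hypothetical stand-in for a simulated observable, each residue is reduced to a single made-up continuous "stickiness" parameter, and every name in the block is an assumption chosen for illustration. The sketch uses forward-mode automatic differentiation (dual numbers) so that it runs with the Python standard library alone.

```python
import math


class Dual:
    """A value paired with its derivative (forward-mode automatic differentiation)."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def sigmoid(x):
    """Squash an unconstrained parameter into (0, 1), propagating the derivative."""
    s = 1.0 / (1.0 + math.exp(-x.val))
    return Dual(s, s * (1.0 - s) * x.dot)


def toy_property(params):
    """Hypothetical observable: mean product of neighboring 'stickiness' values.

    In a real pipeline this would be a quantity measured from a simulation.
    """
    sticky = [sigmoid(p) for p in params]
    total = Dual(0.0)
    for left, right in zip(sticky, sticky[1:]):
        total = total + left * right
    return total * (1.0 / (len(sticky) - 1))


def loss(params, target):
    """Squared distance between the toy observable and the desired value."""
    diff = toy_property(params) - target
    return diff * diff


def gradient(values, target):
    """Exact gradient of the loss: one forward pass per parameter."""
    grad = []
    for i in range(len(values)):
        seeded = [Dual(v, 1.0 if j == i else 0.0) for j, v in enumerate(values)]
        grad.append(loss(seeded, target).dot)
    return grad


def optimize(values, target, lr=10.0, steps=500):
    """Plain gradient descent on the per-residue parameters."""
    for _ in range(steps):
        grad = gradient(values, target)
        values = [v - lr * g for v, g in zip(values, grad)]
    return values


if __name__ == "__main__":
    designed = optimize([0.0] * 8, target=0.8)
    print(toy_property([Dual(v) for v in designed]).val)
```

The loop nudges every parameter in the direction that moves the observable toward the target, which is the "how small changes affect protein properties" step in miniature. Forward mode needs one pass per parameter, so a design problem with many parameters would more plausibly use reverse-mode differentiation, as deep learning frameworks do.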

Ryan Krueger, first author of the paper published in Nature Computational Science, explained that their aim was to use “existing, sufficiently accurate simulations to be able to design proteins at the level of those simulations” rather than relying on the data-heavy approaches common in AI-based protein design. The proteins designed through this method are “differentiable,” meaning their properties derive directly from physics-based molecular dynamics simulations rather than predictions alone. This provides a closer representation of natural protein behavior.

Beyond advancing fundamental understanding, this approach opens doors to designing synthetic proteins for biotechnology and therapeutics by rationally controlling sequence-ensemble-function relationships. The combination of gradient-based optimization and physics-based modeling provides an efficient and precise pathway for exploring a previously inaccessible class of proteins.