MIT researchers have developed a tool that lets nonstatisticians automatically generate models for analyzing raw data. According to the researchers, the sophisticated statistical models it creates resemble those that experts typically use to analyze, interpret, and predict underlying patterns in data.

The tool currently runs in Jupyter Notebook, an open-source web framework that allows users to run programs interactively in their browsers. Users need only write a few lines of code to uncover insights into, for instance, financial trends, voting patterns, and the spread of disease.

In a paper presented at this week's ACM SIGPLAN Symposium on Principles of Programming Languages, the researchers show their tool can accurately extract patterns and make predictions from real-world datasets, and even outperform manually constructed models in certain data-analytics tasks.

"The high-level goal is making data science accessible to people who are not experts in statistics," says first author Feras Saad. "People have a lot of datasets that are sitting around, and our goal is to build systems that let people automatically get models they can use to ask questions about that data."