Transforming Data Interaction with NSF Grant
The National Science Foundation has granted Illinois Institute of Technology researchers $4 million to explore a new vision in managing the increasing complexity and scale of data in modern scientific pursuits.
Xian-He Sun, Ron Hochsprung Endowed Chair of Computer Science, and Anthony Kougkas, assistant research professor of computer science, are receiving the bulk of a $5 million NSF grant to pursue IOWarp. IOWarp aims to reduce the amount of data that needs to be transferred through optimization techniques and data transformation, as well as provide a unified platform that can handle a wide range of data sources and formats, simplifying data management for scientists.
“I’m particularly energized by how IOWarp isn’t just another generic data platform,” Kougkas says. “It’s been carefully designed through direct collaboration with scientists across diverse fields such as materials science, cosmology, and biomedicine. This means we’re not just building technology in isolation. We’re creating solutions that directly address the complex challenges scientists face in their daily work.”
IOWarp is a direct evolution of Hermes, a multi-tiered distributed input/output (I/O) buffering system that was previously developed by Sun, Kougkas, and their research team. IOWarp incorporates the key features from Hermes while adding new functionalities and optimizations to handle the complexities of modern scientific workflows, especially those incorporating artificial intelligence.
“This is the third multi-million-dollar NSF grant that our group has received over the last six years,” Sun says. “From the Hermes data management and transfer systems for high performance computing, to the ChronoLog data systems for cloud computing, to the current IOWarp system for AI applications, we have extended our research horizon from foundation to application and established our leading position in the nation. This award is a great recognition for us and for Illinois Tech.”
The new system has the potential to ease data management in three main ways. First, it addresses challenges in managing the diverse data types and formats required across the different workflow stages of modern scientific pursuits. Second, it aims to reduce the amount of data transferred through mechanisms including tiered content organization and content operators. Third, it supports a wide variety of data sources.
“When I think about how IOWarp could accelerate breakthrough discoveries in fields ranging from atmospheric science to biomedical research by streamlining how researchers work with complex datasets, it’s hard not to be enthusiastic about the impact this could have on scientific progress as a whole,” Kougkas says.
IOWarp features a new natural language interface driven by WarpGPT, a suite of AI technologies being developed by the team to assist scientists in exploring data dynamics using natural language, the ultimate interface.
WarpGPT makes complex analyses and explorations as easy as asking a question, which democratizes data access and analysis by reducing coding barriers, unlocking complex insights, automating data management, and creating a more transparent and reproducible way to document data analysis steps compared to complex code.
“IOWarp’s natural language interface is not just a tool; it’s a vision for the future of scientific data management,” Kougkas says. “By enabling scientists to interact with their data in a way that feels natural and intuitive, it empowers them to unlock the full potential of their research and accelerate discoveries that benefit us all.”
Another key differentiator is IOWarp’s novel data representation termed “content,” which acts like a universal adapter to streamline and simplify data management. It does this by taking data from different sources, such as complex scientific instruments or simulations, and transforming it into a standardized format that any application can understand. It also reduces data bottlenecks and allows researchers to ask complex questions in natural language rather than code, making it easier for AI to understand and extract valuable insights from the data.
“The biggest hurdle for the IOWarp team is convincing the scientific community to adopt this new platform, especially since many researchers already rely on established data management solutions,” Kougkas says. “The team needs to showcase IOWarp’s advantages and directly address concerns that researchers may have about things such as compatibility with their current systems, the time it takes to learn a new platform, and how easily IOWarp can mesh with their existing tools.”
He says a key part of this challenge lies in building a strong and active open-source community around IOWarp. A vibrant community will drive the project’s ongoing development, ensure it stays up to date with technological advances, and keep it relevant in the fast-paced world of scientific computing.
“As someone passionate about advancing scientific discovery, I see tremendous potential in how IOWarp can help scientists spend less time wrestling with data management and more time focusing on their core research questions,” Kougkas says. “The open-source nature of the project adds another layer of excitement: it means we’re not just building a tool but fostering a community that can continuously evolve and improve the platform.”
Sun and Kougkas are working with researchers at the University of Utah on IOWarp, as well as The HDF Group, which will help build the software.
Disclaimer: Research reported in this publication is supported by the National Science Foundation under Award Number 2411318. This content is solely the responsibility of the authors and does not necessarily represent the official views of the National Science Foundation.