Boris Glavic Receives NSF Grant for Data Science Research

Boris Glavic, assistant professor of computer science, along with collaborators from the University at Buffalo and New York University, has received almost $405,000 as a subcontractor on a $2.7 million grant from the National Science Foundation (NSF) for a project that aims to make data cleaning and wrangling easier. The Vizier project is part of an NSF effort to develop software for data exploration, cleaning, curation, and visualization.

In the current world, data is ubiquitous. There are sensors in phones, watches, homes, roads, and factories. Open government regulations put statistics like health code violations and legislative decision-making within reach of an average person. Today, this data is used by doctors, sociologists, business owners, and even ordinary citizens trying to improve their communities. However, using such data to answer simple questions like “Where do police issue the most traffic tickets?” or “What am I doing when my heart rate goes over 90 bpm?” is still hard. The data might be available, but this does not imply that it is fit for use. It may exhibit errors, inconsistencies, and other data quality problems. Data errors are everywhere, and need to be resolved to ensure that analysis results are correct. In corporate settings, analysts will spend days, weeks, or even months “cleaning” their data even before asking a single question.

The Vizier project will streamline the data curation process, making it easier and faster to explore and analyze raw data. The tool used in the project, Vizier, will combine a simple “notebook-style” interface with powerful back-end tools that track changes, edits, and the effects of automation. These forms of “provenance” capture the exploratory curation process: how the cleaning workflows evolve and how data changes over time.

To learn more about the Vizier project, visit the website.

Illinois Tech and Glavic are listed as subcontractors and are receiving $404,979.