Integrating Statistics with AI
Professor Arka Saha’s research integrates the theory and practice of artificial intelligence (AI) and machine learning (ML) with statistics. “AI/ML methods often forgo heavy model assumptions of classical statistics in favor of a model-free data-driven approach,” he says. “Though this lends scalability and flexibility to the AI/ML methods, they frequently neglect the data’s inherent structure.” Classical statistics has long been used to model these domain-specific structures, with scientists’ domain expertise as a foundation. “I combine the power of statistical modeling with the scalability and flexibility of AI/ML, bringing together the best of both worlds.”
Understanding Data Dependence
AI/ML models often ignore the dependence structure of the data, assuming independence implicitly or explicitly. However, this dependence is frequently crucial: failing to account for it can degrade a model’s performance, and the structure itself may be of scientific significance. Professor Saha turns such challenges into assets by explicitly modeling the dependence using statistical tools. “Using the knowledge that the observations or features are dependent, I ‘borrow strength’ across the rows or columns of the data, increasing the power and accuracy of the AI/ML approaches while also providing insight into the dependence structure itself.” He applies this concept to integrate spatial and temporal models into a machine learning framework, which is a key element of his methodological research.
Collaborating for Real-World Impact
Professor Saha focuses on collaborating with scientists to solve open scientific challenges by merging AI/ML approaches with domain expertise via statistics. “My research paradigm on data dependence is of fundamental interest in environmental science, biomedical sciences, oceanography, finance, data privacy, and algorithmic fairness,” he says. “I also work with earth system scientists to evaluate the level of carbon in oceans. This allows us to better understand, forecast, and address a critical component of global environmental change, which can aid in developing policies for a more sustainable future.”
Assistant Professor
arkajyos@uci.edu
DBH 2228