My research aims to build reliable machine learning (ML) systems using tools from information theory and coding theory. As ML systems grow larger and faster and affect more people, their reliability is challenged on many fronts. I group these reliability challenges into two categories:
- Reliability in machine metrics: Large-scale ML computations use thousands of machines, often built from diverse components (e.g., GPUs, accelerators, edge devices). The scale and heterogeneity of these systems make computation time and accuracy less reliable.
- Reliability in human metrics: When ML systems are used to make decisions about people, good performance in machine metrics is not sufficient to earn trust. We also have to consider human aspects such as fairness, inclusivity, and accountability. An algorithm optimized solely for machine metrics can fail catastrophically in human metrics.
Information theory is the foundational discipline that enabled reliable wireless communication systems. My research adapts and reinvents information-theoretic concepts for the context of reliable large-scale ML:
- To achieve reliability in machine metrics, I work with systems experts to develop “coded computing” techniques that make large-scale distributed algorithms resilient to unresponsive or slow nodes.
- To achieve reliability in human metrics, I closely collaborate with social scientists to investigate the fairness of ML algorithms and develop discrimination mitigation strategies that can be used in practical ML pipelines.
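To give a concrete flavor of coded computing, the following is a minimal sketch, not the method of any paper listed here. It assumes a toy setting with three workers, a matrix with an even number of rows, and a single parity block; the helper names `matvec`, `encode`, and `decode` are hypothetical. The matrix is split into two row blocks plus their sum, so the product can be recovered from any two of the three workers even if the third is slow or unresponsive.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def encode(A):
    """Split A into two row blocks and add one parity block A1 + A2.

    Assumes A has an even number of rows (illustrative simplification).
    """
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return [A1, A2, parity]


def decode(results):
    """Recover A @ x from any two of the three worker results.

    `results` maps a worker index (0, 1, or 2) to that worker's output.
    """
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        # Worker 1 is missing: A2 x = (A1 + A2) x - A1 x
        y1 = results[0]
        y2 = [p - a for p, a in zip(results[2], y1)]
    else:
        # Worker 0 is missing: A1 x = (A1 + A2) x - A2 x
        y2 = results[1]
        y1 = [p - b for p, b in zip(results[2], y2)]
    return y1 + y2


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, -1]
tasks = encode(A)

# Pretend worker 0 is a straggler and never returns its result.
results = {i: matvec(tasks[i], x) for i in (1, 2)}
recovered = decode(results)
```

Here `recovered` equals the uncoded product `matvec(A, x)` even though one worker never responded. The cost is one extra worker's worth of computation, which is the basic redundancy-for-resilience trade-off that coded computing generalizes.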
- H. Jeong, M. D. Wu, M. Médard, N. Dasgupta, and F. P. Calmon, “Who Gets the Benefit of the Doubt? Racial Bias in Machine Learning Algorithms Applied to Secondary School Math Education”, NeurIPS 2021 Workshop on Math AI for Education (2021)
- H. Jeong, H. Wang, and F. Calmon, “Fairness without Imputation: A Decision Tree Approach for Fair Prediction with Missing Values”, https://arxiv.org/abs/2109.10431 (2021)
- H. Jeong, V. Cadambe, F. Calmon, and A. Devulapalli, “ε-Approximate Coded Matrix Multiplication is Nearly Twice as Efficient as Exact Multiplication”, JSAIT Special Issue on Coded Computing (2021)
- H. Jeong, Y. Yang, C. Engelmann, V. Gupta, T. M. Low, P. Grover, V. Cadambe, and K. Ramchandran, “3D Coded SUMMA: Communication-Efficient and Robust Parallel Matrix Multiplication”, Euro-Par (2020)