Yihong Wu

Attwood Professor of Statistics and Data Science

Yihong Wu, whose work lies at the intersection of high-dimensional statistics, information theory, and computer science, was recently appointed the James A. Attwood Professor of Statistics and Data Science. He is a member of Yale’s Faculty of Arts and Sciences (FAS) in the Department of Statistics and Data Science, where he serves as department chair. Wu joined the Yale faculty in July 2016, having previously held faculty positions at the University of Illinois at Urbana-Champaign and ENSAE ParisTech and fellowships at the University of Pennsylvania and the University of California, Berkeley.

By developing new approaches to the use of algorithms, Wu identifies communities and networks within high-dimensional data. This work has made transformative contributions to the field of data science. He has developed new techniques of data matching amplification by estimating and predicting the information in unseen data. This research, which Wu has published in the major journals in his field, has implications for understanding voting records, DNA chains, online behavior, traffic patterns, and an array of other data. Wu is also the co-author of the forthcoming book “Information Theory: From Coding to Learning” (Cambridge University Press, 2024), which will offer the most comprehensive explication of modern information theory and its applications in data science and computer science to date.

He was named a Research Professor at the Simons Laufer Mathematical Sciences Institute in 2025, a Fellow of the Institute of Mathematical Statistics in 2023, an Alfred P. Sloan Research Fellow in Mathematics in 2018, and a Simons-Berkeley Research Fellow in 2015. He has received a National Science Foundation (NSF) CAREER grant and the Marconi Society Young Scholar Award, and his work has been supported by multiple NSF grants. These honors testify to the impact of Wu’s work in statistics and data science, mathematics, and beyond.

He currently serves as associate editor of The Annals of Statistics and previously served as associate editor of Statistical Science. He has organized workshops at the Simons Institute for the Theory of Computing, the Banff International Research Station, and MIT, and he has given invited talks at Stanford, Rutgers, Google, Caltech, Princeton, and other institutions around the world. At Yale, he offers courses in high-dimensional statistics, probability theory, information theory, and stochastic processes. His doctoral advisees have gone on to positions at Princeton and Cornell and in industry. Wu earned his Ph.D. from Princeton and his B.E. from Tsinghua University.

What do you do with Data Science?

I am broadly interested in the theoretical and algorithmic aspects of high-dimensional statistics, information theory, and optimization. My current research interests are in understanding the statistical and computational limits of statistical problems arising in large-scale inference and combinatorial settings.

Some representative work:

Zhou Fan, Cheng Mao, Yihong Wu, and Jiaming Xu, "Spectral Graph Matching and Regularized Quadratic Relaxations I: The Gaussian Model, II: Erdős-Rényi Graphs and Universality", Foundations of Computational Mathematics, Jun 2022.

Yihong Wu, Jiaming Xu, and Sophie H. Yu, "Testing correlation of unlabeled random graphs", The Annals of Applied Probability, 2022+.

Cheng Mao and Yihong Wu, "Learning Mixtures of Permutations: Groups of Pairwise Comparisons and Combinatorial Method of Moments", The Annals of Statistics, 2022+.

Yihong Wu and Harrison H. Zhou, "Randomly initialized EM algorithm for two-component Gaussian mixture achieves near optimality in O(sqrt{n}) iterations", Mathematical Statistics and Learning, vol. 4, pp. 143–220, 2021.

Yanjun Han, Soham Jana, and Yihong Wu, "Optimal prediction of Markov chains with and without spectral gap", Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS), 2021.

Vivek Bagaria, Jian Ding, David Tse, Yihong Wu, and Jiaming Xu, "Hidden Hamiltonian Cycle Recovery via Linear Programming", Operations Research, vol. 68, no. 1, Jan 2020.

Yihong Wu and Pengkun Yang, "Optimal estimation of Gaussian mixtures via denoised method of moments", The Annals of Statistics, vol. 48, no. 4, pp. 1981–2007, 2020.

Alon Orlitsky, Ananda Theertha Suresh, and Yihong Wu, "Optimal prediction of the number of unseen species", Proceedings of the National Academy of Sciences, vol. 113, no. 47, pp. 13283–13288, Nov 2016.
