Xiang Zhou, PhD

Professor of Statistics and Data Science

Xiang Zhou is a Professor in the Department of Statistics and Data Science at Yale University. He earned a BS in Biology from Peking University in 2004, followed by an MS in Statistics (2009) and a PhD in Neurobiology (2010) from Duke University. He completed postdoctoral training in the Department of Human Genetics at the University of Chicago (2010–2013), where he later served as the William H. Kruskal Instructor in the Department of Statistics (2013–2014). Dr. Zhou joined the Department of Biostatistics at the University of Michigan as an Assistant Professor in 2014. He held the John G. Searle Assistant Professorship from 2018 to 2019 and was promoted to Associate Professor in 2019 and to full Professor in 2023. He served as Assistant Director of Precision Health (2022–2025) and, in 2025, became Assistant Director of Artificial Intelligence and Digital Health Innovation (AI&DHI). He joined Yale University in 2025.

Dr. Zhou is a Fellow of the American Statistical Association and the recipient of the 2024 Mid-career Biosciences Faculty Achievement Recognition (MBioFAR) Award and the 2025 ICIBM Eminent Scholar Award. He is a standing member of the NIH MRAA Study Section and serves as an Associate Editor for PLOS Genetics, Journal of the American Statistical Association, and Annals of Applied Statistics. In 2024, he was Program Chair for the Section on Statistics in Genomics and Genetics of the American Statistical Association.

His research centers on genomic data science, with a focus on developing statistical and machine learning methods, including deep learning and AI tools, for the analysis of large-scale, high-dimensional genetic and genomic data. His work spans a range of application areas, including genome-wide association studies, single-cell sequencing, and spatial multi-omics.

What do you do with Data Science?

Our research centers on genomic data science, where we develop statistical and machine learning (ML) methods, including deep learning and artificial intelligence (DL/AI) tools, for the analysis of large-scale, high-dimensional genetic and genomic studies. Key methodological areas include mixed-effects models, spatial statistics, causal inference, mediation analysis, Bayesian methods including nonparametrics, kernel methods, graphical models, mixture models, continuous and discrete latent variable models, Poisson process and Gaussian process models, integrative modeling, statistical computing, approximate and scalable inference, and, more recently, deep learning. Key application areas include genome-wide association studies (GWAS), transcriptome-wide association studies (TWAS), molecular quantitative trait loci (QTL) mapping studies such as expression QTL (eQTL) and methylation QTL (mQTL) mapping studies, and various functional genomic studies such as chromatin immunoprecipitation sequencing (ChIPseq), bulk RNA sequencing (RNAseq), single-cell RNAseq (scRNAseq), bisulfite sequencing (BSseq), and, more recently, spatial omics studies. By developing novel analytic methods for state-of-the-art genetic and genomic techniques, we aim to extract key biological insights from these data and to advance our understanding of how genomic variation influences biological function and contributes to phenotypic variation in human diseases and disease-related complex traits.
