Michael Kane is an Assistant Professor of Biostatistics at Yale University. He received his Ph.D. and MA in Statistics at Yale University and his MA in Electrical Engineering and BS in Computer Engineering at Rochester Institute of Technology. His current research in machine learning focuses on constructing and characterizing Hilbert-space representations of data for un-, self-, semi-, and supervised learning challenges in data science. His current application areas include understanding patient heterogeneity in clinical and observational trials, the development of prognostic biomarkers in accelerometry data, and the characterization of population-scale human mobility patterns using cell phone data. He has been the PI for awards from the NSF and DARPA and has industrial collaborations with Amgen, Takeda, and AT&T Labs Research.
What do you do with data science?
My interest in data science is focused on the intersection of machine learning methods and their implementations in the biosciences. Models constructed in this area are operationalized through collaborations with domain experts and depend on those experts' ability to understand the models and their limitations as they are put into practice. As a result, my work can be distinguished from more traditional machine learning research in that prediction, while extremely important, is not always paramount. Methods development in these areas must incorporate interpretability as well as prediction uncertainty to be considered viable. Implementing these methods requires that we draw on software engineering (architecture and design, reproducible research, etc.) so that they can be applied in real-world settings.
Kane, Michael, John W. Emerson, and Stephen Weston. “Scalable strategies for computing with massive data.” Journal of Statistical Software 55 (2013): 1-19.
Gilani, Owais, Simon Urbanek, and Michael J. Kane. “Distributions of Human Exposure to Ozone During Commuting Hours in Connecticut Using the Cellular Device Network.” Journal of Agricultural, Biological and Environmental Statistics 25, no. 1 (2020): 54-73.
Kane, Michael, Xun Tony Jiang, and Simon Urbanek. “On the Programmatic Generation of Reproducible Documents.” Journal of Statistical Software 103 (2022): 1-15.