FDS Data Science Project Match
Monday, December 8, 2025
3:00PM - 4:00PM
Snacks and conversations to follow in 1307
Location: Yale Institute for Foundations of Data Science, Kline Tower 13th Floor, Room 1327, New Haven, CT 06511 and via Webcast: https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=7c2eea03-86f9-4088-8380-b398014d9d6e
Speaker: Soheil Ghili (SOM)
Associate Professor (Quantitative Marketing Group)
Yale School of Management
Talk Title: Market Design for Agentic Commerce: Pay-Per-Crawl Pricing for the AI Economy
Speaker: Leandros Tassiulas (SEAS)
John C. Malone Professor of Electrical & Computer Engineering
Yale Engineering
Talk Title: An AI assistant for Regulatory Insight and Data-Driven Analysis in the Power Industry (AIReD)
Presented by Ibrahim Ibne Alam, Postdoctoral Associate
We propose a next-generation AI assistant for Regulatory Insight and Data-Driven Analysis (AIReD) designed to support a wide range of power-industry stakeholders, including consumers, generators, system operators, and management. The agent combines retrieval-augmented generation (RAG) for transparent, context-aware answers with tool-calling for advanced analytical tasks such as statistical evaluation, forecasting, and optimization. The project has completed a comprehensive landscape review of existing power-domain LLM efforts, evaluated baseline model performance on ElecBench with and without LoRA adaptation, curated key regulatory documents for domain grounding, and implemented a working RAG prototype. An ML model for EV charging-demand prediction is currently being explored as part of the envisioned tool-calling framework. By integrating citation-based knowledge access with computational reasoning, AIReD aims to become a reliable, domain-grounded AI assistant that strengthens decision-making and operational efficiency across the power sector.
Speaker: Brian Macdonald (Yale)
Senior Lecturer and Research Scientist
Statistics and Data Science
Yale University
Talk Title: Sports and Environmental Data Science Projects
We will discuss projects in two areas, environmental data science and sports, many of which involve working with industry partners. One project focuses on working with an industry expert in large-scale lithium-ion battery energy storage systems (BESS) to model and simulate thermal runaway and ensuing fires in BESS. In another project, the goal is to develop an R package that streamlines incorporating active learning into the manual labeling of land cover types for remote-sensing data projects. The sports analytics lab at Yale has a variety of opportunities, including using spatiotemporal bat and pitch tracking data to analyze batter-pitcher interactions (with the Boston Red Sox MLB team); analyzing the value of draft picks (with the New York Liberty WNBA team); developing a Stuff+ metric for evaluating pitches and pitchers (with the Yale baseball team’s pitching coach); using data to understand why horse racing is losing participation (owners) despite higher earnings, and to find improvements that make horse racing more appealing and sustainable for owners in the long run (with the Jockey Club and Joseph Appelbaum, Yale ’90); various projects in lacrosse analytics (with the Yale lacrosse team); and projects with the United States Olympic and Paralympic Committee (several possible Olympic sports).
Speaker: Purushottam Dixit (SEAS)
Assistant Professor of Biomedical Engineering
Yale Engineering
Talk Title: Quantifying dimensionality of microbiomes
Many high-dimensional biological systems, from microbial communities to gene expression dynamics, appear complicated on the surface but often evolve on low-dimensional manifolds governed by latent ecological or biophysical constraints. Our group is developing new statistical tools to quantify this effective dimensionality directly from data and to identify signatures of low-dimensional organization in mechanistic models, specifically in microbiomes. We want you to analyze real and simulated time-series datasets, test algorithms for dimensionality estimation, and compare theoretical models against empirical patterns. The project is ideal for students interested in statistics, machine learning, or dynamical systems; no biology background is required. You’ll be working at a conceptual frontier: figuring out when and why messy biological systems actually behave in surprisingly simple, structured ways.
Speaker: Mark Gerstein (Yale)
Albert L Williams Professor of Biomedical Informatics and Professor of Molecular Biophysics & Biochemistry, of Computer Science, and of Statistics & Data Science
Yale University
Talk Title: Genomics & Bioinformatics Research Opportunities in the Gerstein Lab
Presented by Joel Rozowsky
The Gerstein lab conducts computational biology and bioinformatics research in the biomedical and genomic fields. We use a range of computational analytics methods, including artificial-intelligence and machine-learning techniques, to analyze large biomedical datasets and develop bioinformatics tools. The lab focuses in particular on neurogenomics, personal genomes, genomic privacy, and genome annotation.
Speaker: Zongming Ma (Yale)
Professor, Statistics and Data Science
Yale University
Talk Title: Data Integration in Spatial and Single-cell Biology
Spatial and single-cell technologies have generated massive datasets through consortia-level efforts. However, these data remain fragmented by differences in biological modalities, disease conditions, spatial resolutions, and technology-induced artifacts. To tame this “Wild West” of data from cutting-edge biotech sensors, our goal is to leverage advanced machine learning and AI techniques to develop rigorous, scalable integration methods and standardize them into a unified framework for ingesting, processing, and harmonizing diverse spatial omics data.
Seeking students with interest in:
Speaker: Phillip Atiba Solomon (fka Goff) (Yale)
Carl I. Hovland Professor of Black Studies and Professor of Psychology
Co-Founder & CEO, Center for Policing Equity
Yale University
Talk Title: A Model-Based National Estimate of Police Use-of-Force
In the United States, police regularly use force against civilians. Much of the prior research in this area has focused on lethal force, but much less is known about the vast majority of incidents that do not result in death. This project aims to generate the first national estimate of the annual number of use-of-force incidents in the US, overall and by race/ethnicity. Leveraging newly available use-of-force data for thousands of agencies and a national dataset of predictors, we use Bayesian and machine learning approaches to estimate use-of-force counts for the entire US.
Speaker: Meg Urry (Yale)
Israel Munson Professor of Physics
Yale University
Talk Title: Finding Merging Galaxies and Supermassive Black Holes in Large Astronomical Surveys
We have data covering large areas of the sky that need to be scanned automatically to find candidates for merging galaxies, dual AGN (Active Galactic Nuclei; a dual AGN occurs when both black holes in the merging galaxies are growing rapidly and thus shining brightly), and all combinations in between. Right now, we’ve trained a first-generation CNN that does the job okay, but at a minimum, we need to update it to incorporate multiple images taken at different wavelengths, to query the full data set (right now, we start with known AGN and look for a companion), and to distinguish stars from AGN. More ambitious goals: Is there a better approach than training a CNN (we use a large number of simulated images and then a more limited number of bona fide cases)? Can we train it more efficiently and/or improve the code in other ways?
The FDS Data Science Project Match, hosted by the Yale Institute for Foundations of Data Science (FDS), is an opportunity for Yale faculty from any department or school within the university to connect with talented students from the departments of Statistics and Data Science, Applied Mathematics, and Computer Science. In a series of lightning-round talks, faculty will have exactly five minutes to pitch a current research problem, aiming to team up with students interested in tackling complex data challenges. This event facilitates collaboration on current research projects, offering a platform for faculty to present their data-driven initiatives and find skilled undergraduate and/or graduate students eager to contribute. It’s also a wonderful way to learn about the research of many Yale faculty.
