Colloquium

Understanding AI systems by understanding their training data: Memorization, generalization, and points in between

Speaker: Tom McCoy (Yale)

Assistant Professor of Linguistics

Yale University

Wednesday, September 24, 2025

11:30AM - 1:00PM

Lunch at 11:30am in 1307
Talk 12:00-1:00pm in 1327

Location: Yale Institute for Foundations of Data Science, Kline Tower 13th Floor, Room 1327, New Haven, CT 06511 and via Webcast: https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=3390c929-3fc3-4e3d-86dc-b35b00f766ec

Abstract: Large language models (LLMs) can perform a wide range of tasks impressively well. To what extent are these abilities driven by shallow heuristics vs. deeper abstractions? I will argue that, to answer this question, we must view LLMs through the lens of generalization. That is, we should consider the data that LLMs were trained on so that we can identify whether and how their abilities go beyond their training data. In the analyses of LLMs that I will discuss, this perspective reveals both impressive strengths and surprising limitations. For instance, LLMs often produce sentence structures that are well-formed but that never appeared in their training data, yet they also struggle on some seemingly simple algorithmic tasks (e.g., decoding simple ciphers) in ways that are well explained by training data statistics. In sum, to understand what AI systems are, we must understand what we have trained them to be.

Speaker bio: Tom McCoy is an Assistant Professor of Linguistics at Yale University, with a secondary appointment in Computer Science. His research aims to bridge the divide between linguistics and artificial intelligence: how can we create AI systems that replicate the rapid learning and robust generalization that humans display when processing language? Much of this work involves analyzing the performance and internal processing of neural network language models. He received his PhD from the Department of Cognitive Science at Johns Hopkins, and his PhD thesis received a Glushko Dissertation Prize from the Cognitive Science Society. He then did a postdoc in Computer Science at Princeton before joining the faculty at Yale. Outside of research, he is an organizer and problem writer for NACLO, a contest that introduces high school students to linguistics and natural language processing.


