BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//7.2.3.1//EN
TZID:America/New_York
X-WR-TIMEZONE:America/New_York
BEGIN:VEVENT
UID:855@fds.yale.edu
DTSTART;TZID=America/New_York:20250331T153000
DTEND;TZID=America/New_York:20250331T170000
DTSTAMP:20250916T142149Z
URL:https://fds.yale.edu/events/sds-seminar-zhuoran-yang-yale-unveiling-in
 -context-learning-provable-training-dynamics-and-feature-learning-in-trans
 formers/
SUMMARY:S&DS Seminar: Zhuoran Yang (Yale)\, "Unveiling In-Context Lear
 ning: Provable Training Dynamics and Feature Learning in Transformers"
DESCRIPTION:Abstract: In-context learning (ICL) is a cornerstone of larg
 e language model (LLM) functionality\, yet its theoretical foundations rem
 ain elusive due to the complexity of transformer architectures. In particu
 lar\, most existing work only theoretically explains how the attention mec
 hanism facilitates ICL under certain data models. It remains unclear how t
 he other building blocks of the transformer contribute to ICL. To address 
 this question\, we study how a simple softmax transformer is trained to pe
 rform ICL on two synthetic tasks: (multi-task) linear regression and
  n-gram Markov chains. We show that the transformer successfully learns
  these tasks in-context. More importantly\, we will interpret the
  estimator represent
 ed by the learned transformer\, show how transformers are trained by gradi
 ent-based dynamics\, and how features emerge during training. Our theory i
 s further validated by experiments.\n\nThis is joint work with Siyu Ch
 en\, Jianliang He\, Xintian Pan\, Heejune Sheen\, and Tianhao Wang.\n\n
 Speaker bio: Zhuoran Yang is an Assistant Professor of Statistics and Dat
 a Science and Computer Science at Yale University. He is also affiliated w
 ith the Yale Institute for Foundations of Data Science and the Center for 
 Algorithms\, Data\, and Market Design (CADMY) at Yale. His research lies a
 t the intersection of machine learning\, statistics\, game theory\, and op
 timization.\n\nYang's recent work focuses on the foundations of reinfo
 rcement learning\, particularly in multi-agent systems where agents intera
 ct strategically. Additionally\, he explores the foundations of artificial
  intelligence\, investigating the emergent behaviors of large language mod
 els during pre-training and post-training and their relationship to model 
 architecture. His research is supported by NSF DMS 2413243.\n\nBefore joi
 ning Yale\, Yang was a postdoctoral researcher at the University of Cal
 ifornia\, Berkeley\, under the mentorship of Michael I. Jordan. He earned 
 his Ph.D. in Operations Research and Financial Engineering from Princeton 
 University\, co-advised by Jianqing Fan and Han Liu. He completed his bach
 elor's degree in Mathematics at Tsinghua University in 2015.\n\nWebsit
 e: https://zhuoranyang.github.io/ \n
CATEGORIES:FDS Events,Statistics & Data Science Seminar
END:VEVENT
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
DTSTART:20250309T030000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
END:VCALENDAR