BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//7.2.3.1//EN
TZID:America/New_York
X-WR-TIMEZONE:America/New_York
BEGIN:VEVENT
UID:644@fds.yale.edu
DTSTART;TZID=America/New_York:20241028T160000
DTEND;TZID=America/New_York:20241028T170000
DTSTAMP:20250916T142141Z
URL:https://fds.yale.edu/events/sds-seminar-florentina-bunea-cornell/
SUMMARY:S&DS Seminar: Florentina Bunea (Cornell)\, "Learning Large Softmax
  Mixtures with Warm Start EM"
DESCRIPTION:\nMixed multinomial logits are discrete mixtures introduced sev
 eral decades ago to model the probability of choosing an attribute x_j in
  R^L from p possible candidates\, in heterogeneous populations. The model
  has recently attracted attention in the AI literature\, under the name
  softmax mixtures\, where it is routinely used in the final layer of a
  neural network to map a large number p of vectors in R^L to a
  probability vector. De
 spite its wide applicability and empirical success\, statistically optimal
  estimators of the mixture parameters\, obtained via algorithms whose run
 ning time scales polynomially in L\, are not known. This paper provides a
  solution to this problem for contemporary applications\, such as LLMs (L
 arge Language Models)\, in which the mixture has a large number p of suppo
 rt points\, and the size N of the sample observed from the mixture is als
 o large. Our proposed estimator combines two classical estimators\, obtain
 ed respectively via a method of moments (MoM) and the
  expectation-maximization (EM) algorithm. Although both estimator types
  have been studied\, fr
 om a theoretical perspective\, for Gaussian mixtures\, no similar results
  exist for softmax mixtures for either procedure. We develop a new MoM pa
 rameter estimator based on latent moment estimation that is tailored to ou
 r model\, and provide the first theoretical analysis for a MoM-based
  procedure in softmax mixtures. Although consistent\, as N\, p ->
  infinity\, MoM for so
 ftmax mixtures can exhibit poor numerical performance\, an empirical obser
 vation that is in line with those made for other mixture models. Neverthel
 ess\, as MoM is provably in a neighborhood of the target\, it can be used
  as a warm start for any iterative algorithm. We study in detail the EM
  algorithm\, and provide its first theoretical analysis for softmax
  mixtures\, extending the only other class of similar results\, valid for
  Gaussian mixtures. Our final proposal for parameter estimation is the EM
  algorithm with
  a MoM warm start. In addition to leading to the desired parametric estim
 ation rates\, this combined procedure provides computational savings rela
 tive to the standard practice of selecting one of the outputs of multiple
  EM runs\, each initialized at random. These facts are supported by our s
 imulation studies. Concrete examples that substantiate the large applicab
 ility of the model will be given throughout the talk. \n\n\n\n3:30pm - Pr
 e-talk meet and greet teatime - 219 Prospect Street\, 13th floor\, there
  will be light snacks and beverages in the kitchen area.\n\n\n\nBio:
  Florentin
 a Bunea is a Professor in the Department of Statistics and Data Science at
  Cornell University\, where she is also a member of the Graduate Fields of
  Statistics\, Applied Mathematics\, and Computer Science. As a member of t
 he Diversity and Inclusion Council of the Bowers College of Computing and 
 Information Science\, she is dedicated to promoting diversity within data 
 science disciplines.\n\n\n\nProfessor Bunea's research spans statistical m
 achine learning theory and high-dimensional statistical inference\, with a
  focus on developing new methodologies and sharp theoretical insights for 
 addressing a range of data science challenges. Her recent projects include
  estimation and theory for soft-max mixtures to deepen the understanding o
 f large language models (LLMs) and AI algorithms\, optimal transport for h
 igh-dimensional mixture distributions\, and inference for the Wasserstein 
 distance in topic models. She is also working on high-dimensional latent-s
 pace clustering\, cluster-based inference\, network modeling\, and latent 
 structure inference in high-dimensional models.\n\n\n\nHer research intere
 sts extend to model selection\, sparsity\, and dimension reduction\, with 
 applications in fields such as genetics\, systems immunology\, neuroscienc
 e\, sociology\, and economics. Professor Bunea's work is supported by the 
 National Science Foundation (NSF-DMS). She is a Fellow of the Institute of
  Mathematical Statistics (IMS) and a recipient of the prestigious IMS Meda
 llion Award. She has served as an Associate Editor for leading statistical
  journals\, including Annals of Statistics\, Bernoulli\, JASA\, JRSS-B\, a
 nd EJS\, and is a co-editor for the Chapman and Hall Statistics and Applie
 d Probability Monograph Series.
CATEGORIES:FDS Events,Statistics & Data Science Seminar
LOCATION:Yale Institute for Foundations of Data Science\, Kline Tower 13th 
 Floor\, Room 1327\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ro
 om 1327\, New Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITL
 E=Yale Institute for Foundations of Data Science:geo:0,0
END:VEVENT
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
DTSTART:20240310T030000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
END:VCALENDAR