BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//6.4.8//EN
X-WR-TIMEZONE:America/New_York
BEGIN:VEVENT
UID:432@fds.yale.edu
DTSTART;TZID=America/New_York:20221109T160000
DTEND;TZID=America/New_York:20221109T180000
DTSTAMP:20240621T160701Z
URL:https://fds.yale.edu/events/fds-seminar-priya-panda-exploring-robustne
ss-and-energy-efficiency-in-neural-systems-with-spike-based-machine-intell
igence/
SUMMARY:FDS Seminar: Priya Panda (Department of Electrical Engineering) "E
xploring Robustness and Energy-Efficiency in Neural Systems with Spike-ba
sed Machine Intelligence"
DESCRIPTION:\nAbstract: Spiking Neural Networks (SNNs) have recently emerge
d as an alternative to deep learning due to their huge energy efficiency b
enefits on neuromorphic hardware. In this presentation\, I will talk about
important techniques for training SNNs that bring huge benefits in term
s of latency\, accuracy\, interpretability\, and robustness. We will first
delve into how training is performed in SNNs. Training SNNs with surrogat
e gradients presents computational benefits due to short latency. However\
, due to the non-differentiable nature of spiking neurons\, the training b
ecomes problematic and surrogate methods have thus been limited to shallow
networks. To address this training issue with surrogate gradients\, we wi
ll go over a recently proposed method\, Batch Normalization Through Time (
BNTT)\, that allows us to train SNNs from scratch with very low latency and en
ables us to target interesting applications like video segmentation and be
yond traditional learning scenarios\, like federated training. Another cri
tical limitation of SNNs is the lack of interpretability. While a consider
able amount of attention has been given to optimizing SNNs\, the developme
nt of explainability is still in its infancy. I will talk about our recen
t work on a bio-plausible visualization tool for SNNs\, called Spike Activa
tion Map (SAM) compatible with BNTT training. The proposed SAM highlights
spikes with a short inter-spike interval\, containing discriminative infor
mation for classification. Finally\, with the proposed BNTT and SAM\, I will h
ighlight the robustness aspect of SNNs with respect to adversarial attacks
. In the end\, I will talk about interesting prospects of SNNs for non-con
ventional learning scenarios such as privacy-preserving distributed learni
ng as well as unraveling the temporal correlation in SNNs with feedback co
nnections. Finally\, time permitting\, I will talk about the prospects of
SNNs for novel and emerging compute-in-memory hardware that can potentiall
y yield orders of magnitude lower power consumption than conventional CPUs/
GPUs.\n\n\n\nBio: Priya's research interests lie in Neuromorphic Computing
: spanning energy-efficient design methodologies for deep learning network
s\, novel supervised/unsupervised learning algorithms for spiking neural n
etworks and developing neural architectures for new computing scenarios (s
uch as lifelong learning\, generative models\, stochastic networks\, adver
sarial attacks etc.).\n\n\n\nHer goal is to empower energy-aware and energ
y-efficient machine intelligence through algorithm-hardware co-design whil
e being secure to adversarial scenarios and catering to the resource const
raints of Internet of Things (IoT) devices.\n\n\n\nWebsite: https://seas.y
ale.edu/faculty-research/faculty-directory/priya-panda\n\n\n\n\nWatch (Acc
ess to Yale network required)\n\n
CATEGORIES:Seminar Series
LOCATION:https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=f2008
6eb-012d-4ead-a949-af1c01268d6d
END:VEVENT
BEGIN:VEVENT
UID:433@fds.yale.edu
DTSTART;TZID=America/New_York:20221122T160000
DTEND;TZID=America/New_York:20221122T180000
DTSTAMP:20240621T161201Z
URL:https://fds.yale.edu/events/fds-seminar-michael-lopez-sr-nfl-analyzing
-the-national-football-league-is-challenging-but-player-tracking-data-is-h
ere-to-help/
SUMMARY:FDS Seminar: Michael Lopez Sr. (NFL) "Analyzing the National Footba
ll League is challenging\, but player tracking data is here to help"
DESCRIPTION:\nAbstract: Most historical National Football League (NFL) anal
ysis\, both mainstream and academic\, has relied on play-by-play data to g
enerate team and player-level trends. Given the number of outside variable
s that impact on-field results\, such as play call and game situation\, fi
ndings are often no more than interesting anecdotes. With the release of p
layer tracking data\, however\, analysts can appropriately ask and answer
questions that better isolate player skill and coaching strategy. In this
talk\, we highlight the limitations of traditional analyses\, and use a de
cades-old punching bag for analysts – fourth-down strategy – as a micr
ocosm for why tracking data is needed.\n\n\n\n\nView Webinar\n\n
CATEGORIES:Seminar Series
LOCATION:https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=2adcd
8f2-132c-41ca-abb3-af3900d8fe0d
END:VEVENT
BEGIN:VEVENT
UID:431@fds.yale.edu
DTSTART;TZID=America/New_York:20221216T130000
DTEND;TZID=America/New_York:20221216T140000
DTSTAMP:20240226T190412Z
URL:https://fds.yale.edu/events/fds-seminar-yuchen-wu-stanford-university/
SUMMARY:FDS Seminar: Yuchen Wu (Stanford University)
DESCRIPTION:Fundamental Limits of Low-Rank Matrix Estimation: Information-T
heoretic and Computational Perspectives\n\n\n\nAbstract: Many statistical
estimation problems can be reduced to the reconstruction of a low-rank n×
d matrix when observed through a noisy channel. While tremendous positive
results have been established\, relatively few works focus on understandin
g the fundamental limitations of the proposed models and algorithms. Under
standing such limitations not only provides practitioners with guidance on
algorithm selection\, but also spurs the development of cutting-edge meth
odologies. In this talk\, I will present some recent progress in this dire
ction from two perspectives in the context of low-rank matrix estimation.
From an information-theoretic perspective\, I will give an exact character
ization of the limiting minimum estimation error. Our results apply to the
high-dimensional regime n\,d→∞ and d/n→∞ (or d/n→0) and general
ize earlier works that focus on the proportional asymptotics n\,d→∞\,
d/n→δ∈(0\,∞). From an algorithmic perspective\, large-dimensional m
atrices are often processed by iterative algorithms like power iteration a
nd gradient descent\, thus encouraging the pursuit of understanding the fu
ndamental limits of these approaches. We introduce a class of general firs
t order methods (GFOM)\, which is broad enough to include the aforemention
ed algorithms and many others. I will describe the asymptotic behavior of
any GFOM\, and provide a sharp characterization of the optimal error achie
ved by the GFOM class. This is based on joint works with Michael Celentan
o and Andrea Montanari.\n\n\n\nThis seminar was held virtually over Zoo
m and a recording is not available.\n
CATEGORIES:Postdoctoral Applicants,Seminar Series
LOCATION:Webcast\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Webcast:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:427@fds.yale.edu
DTSTART;TZID=America/New_York:20221216T150000
DTEND;TZID=America/New_York:20221216T160000
DTSTAMP:20240226T190411Z
URL:https://fds.yale.edu/events/fds-seminar-aditi-laddha-ga-tech/
SUMMARY:FDS Seminar: Aditi Laddha (GA Tech)
DESCRIPTION:"High-Dimensional Markov Chains and Applications"\n\n\n\nAbstra
ct: A Markov chain is a random process in which the next state is chosen a
ccording to some probability distribution that depends only on the current
state. In a high-dimensional setting\, Markov chains are essential tools
for understanding the geometry of the space and form the backbone of many
efficient randomized algorithms for tasks like optimization\, integration\
, linear programming\, approximate counting\, etc. In this talk\, I will p
rovide an overview of my research on “High-Dimensional Markov Chains\,
” with a focus on the geometric aspects of the chains. I will describe t
wo results that illustrate the importance of Markov chains for designing e
fficient algorithms. First\, I will discuss my work on a barrier-based ran
dom walk for bounding the discrepancy of set systems. I will then present
a general framework for bounding discrepancy in various settings. Second\,
I will describe two Markov chains\, the Weighted Dikin Walk and Coordinat
e Hit-and-Run for sampling convex bodies\, and discuss new techniques for
bounding their convergence rates.\n\n\n\nThis seminar was held virtually o
ver Zoom and no recording is available.\n
CATEGORIES:Postdoctoral Applicants,Seminar Series
LOCATION:Webcast\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Webcast:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:426@fds.yale.edu
DTSTART;TZID=America/New_York:20221219T123000
DTEND;TZID=America/New_York:20221219T133000
DTSTAMP:20240226T190410Z
URL:https://fds.yale.edu/events/fds-seminar-alkis-kalavasis-national-techn
ical-university-of-athens/
SUMMARY:FDS Seminar: Alkis Kalavasis (National Technical University of Athe
ns)
DESCRIPTION:"Efficient Algorithms and Computational Barriers in Reliable Ma
chine Learning"\nSpeaker: Alkis Kalavasis\nNational Technical University o
f Athens\nAbstract: In this talk\, we will discuss the computational challen
ges arising in various problems in Reliable Machine Learning. Reliable ML
aims at the design of computationally efficient algorithms that provide gu
arantees such as robustness to biased data\, reproducibility and privacy.
We first focus on the design of algorithms robust to biased and corrupte
d observations. We begin with the problem of learning from coarse data. Th
e motivation behind this problem is that in many learning tasks one may no
t have access to fine-grained label information\; e.g.\, an image can be l
abeled as husky\, dog\, or even animal depending on the expertise of the a
nnotator. We formalize these settings from the viewpoint of computational
learning theory and provide efficient algorithms and computational hardnes
s results. We then continue with the task of learning noisy linear label r
ankings. Label ranking is the supervised task of learning a sorting functi
on that maps feature vectors to rankings over a finite set of labels. We p
rovide the first efficient algorithms for learning linear sorting function
s in the presence of bounded noise (an extension of the Massart noise cond
ition to label rankings) under Gaussian marginals. Next\, we consider ques
tions regarding responsibility aspects of ML systems. We study the importa
nt problem of reproducibility as an algorithmic property in decision-makin
g settings. We introduce the notion of reproducible policies in the contex
t of stochastic bandits\, one of the canonical problems in interactive lea
rning. A policy in the bandit environment is called reproducible if it pul
ls\, with high probability\, the exact same sequence of arms in two differ
ent and independent executions (under independent reward realizations and
shared internal randomness). We show that not only do reproducible policie
s exist\, but also they achieve almost the same optimal (non-reproducible)
regret bounds in terms of the time horizon. At the end of the talk\, we w
ill briefly discuss some ongoing work on the complexity of min-max optimi
zation\, a fundamental problem in the area of equilibrium computation in m
ulti-agent environments.
CATEGORIES:Postdoctoral Applicants
LOCATION:Webcast\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Webcast:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:430@fds.yale.edu
DTSTART;TZID=America/New_York:20230112T150000
DTEND;TZID=America/New_York:20230112T160000
DTSTAMP:20240226T190411Z
URL:https://fds.yale.edu/events/fds-seminar-arnab-auddy-columbia/
SUMMARY:FDS Seminar: Arnab Auddy (Columbia)
DESCRIPTION:"Statistical Benefits and Computational Challenges of Tensor Sp
ectral Learning"\n\n\n\nTalk Abstract: Given multivariate observations fro
m a statistical model\, tensors are a natural way of recording higher ord
er interactions among variables. Tensor spectral learning is a collectio
n of methods wherein we aim to decompose a tensor into its components\, each o
f which corresponds to interpretable features of the model. This approach h
as recently received a lot of attention for its application to latent vari
able models. In this talk\, I will focus on orthogonally decomposable tens
ors\, which arise naturally in many problems. These tensors have a decompo
sition that can be interpreted very similarly to matrix SVD\, but automati
cally provides much better identifiability properties than their matrix co
unterparts. I will show that in such a tensor decomposition\, a small pert
urbation affects each singular vector in isolation\, and their estimabili
ty does not depend on the gap between consecutive singular values. In con
trast to these attractive statistical properties\, in general\, tensor met
hods present us with intriguing computational considerations. I will illus
trate these phenomena in the particular application to a spiked tensor PC
A problem and in Independent Component Analysis (ICA). Interestingly\, th
ere is a gap between the information-theoretic and computationally tracta
ble limits of both problems. Above the computational threshold\, we provide nois
e robust algorithms based on spectral truncation\, which provide rate opti
mal estimators. Our estimators are also asymptotically normal\, thus allowi
ng confidence interval construction. Finally\, I will present some examples d
emonstrating our theoretical findings.\n\n\n\nThis talk was held virtually
on January 12\, 2023 @ 3:00 pm\n
CATEGORIES:Postdoctoral Applicants
LOCATION:Webcast\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Webcast:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:428@fds.yale.edu
DTSTART;TZID=America/New_York:20230113T130000
DTEND;TZID=America/New_York:20230113T140000
DTSTAMP:20240226T190411Z
URL:https://fds.yale.edu/events/fds-seminar-gaurav-mahajan-ucsd/
SUMMARY:FDS Seminar: Gaurav Mahajan (UCSD)
DESCRIPTION:“Computational-Statistical Gaps in Reinforcement Learning”\
n\n\n\nSpeaker: Gaurav Mahajan (UCSD)\n\n\n\nAbstract: A fundamental assum
ption in the theory of reinforcement learning is "RL with linear function appr
oximation". Under this assumption\, the optimal value function (either Q*\
, or V*\, or both) can be obtained as the linear combination of finitely m
any known basis functions. Even though it was observed as early as 1963 th
at there are empirical benefits of using linear function approximation\, o
nly recently did a series of works design sample efficient algorithms for thi
s setting. These works posed an important open problem: Can we design poly
nomial time algorithms for this setting? In this talk\, I will go over re
cent progress on this open problem: a surprising computational-statistic
al gap in reinforcement learning. Even though we have polynomial sample co
mplexity algorithms\, under a standard hardness assumption (NP != RP) the
re are no polynomial time algorithms for this setting. I will start by go
ing over a few algorithmic ideas for designing sample efficient algorith
ms in RL and then move on to show how to build hard MDPs that satisfy th
e linear function approximation assumption from hard 3-SAT instances. I w
ill end the talk by discussing a few open problems in RL and sequence mod
elling.\n\n\n
\nRemote presentation only.\n\n\n\nJoin from PC\, Mac\, Linux\, iOS or An
droid: https://yale.zoom.us/j/94359913798\nOr Telephone: 203-432-9666 (2-
ZOOM if on-campus) or 646 568 7788\nOne Tap Mobile: +12034329666\,\,94359
913798# US (Bridgeport)\n\n\n\nMeeting ID: 943 5991 3798\nInternational n
umbers available: https://yale.zoom.us/u/ac1Gq3KLWp\n\n\n\nWebcast\n\n\n\n\n
CATEGORIES:Postdoctoral Applicants
END:VEVENT
BEGIN:VEVENT
UID:429@fds.yale.edu
DTSTART;TZID=America/New_York:20230113T150000
DTEND;TZID=America/New_York:20230113T160000
DTSTAMP:20240226T190411Z
URL:https://fds.yale.edu/events/fds-seminar-ming-yin-ucsb/
SUMMARY:FDS Seminar: Ming Yin (UCSB)
DESCRIPTION:\n\n\n“Instance-Adaptive and Optimal Offline Reinforcement Le
arning” \nSpeaker: Ming Yin (UCSB) \nAbstract: Reinforcement Learning is
becoming the mainstay of sequential decision-making problems. In particul
ar\, offline reinforcement learning is considered the central framework fo
r real-life applications when online interactions are not permitted. This
talk will expose the main challenges for offline RL (including distributio
n shift\, the curse of the horizon\, and suboptimal data) and offer ou
r solutions on how to bypass them. I will discuss how to improve the sampl
e efficiency using various techniques and show how they adapt to the hardn
ess of individual problems. I will also briefly discuss the connection bet
ween these methodologies and their extensions to more general settings.\nR
emote presentation only.\nJoin from PC\, Mac\, Linux\, iOS or Android: htt
ps://yale.zoom.us/j/95770019076\nOr Telephone: 203-432-9666 (2-ZOOM if on
-campus) or 646 568 7788\nOne Tap Mobile: +12034329666\,\,95770019076# U
S (Bridgeport)\nMeeting ID: 957 7001 9076\nInternational numbers availab
le: https://yale.zoom.us/u/adTjb3rkTu
CATEGORIES:Postdoctoral Applicants
LOCATION:Webcast\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Webcast:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:425@fds.yale.edu
DTSTART;TZID=America/New_York:20230119T110000
DTEND;TZID=America/New_York:20230119T120000
DTSTAMP:20240226T190410Z
URL:https://fds.yale.edu/events/sds-seminar-edward-de-brouwer-ku-leuven/
SUMMARY:S&DS Seminar: Edward De Brouwer (KU Leuven)
DESCRIPTION:"Predicting the impact of treatments over time with uncertain
ty-aware neural differential equations"\n\n\n\nSpeaker: Edward De Brouwer (K
U Leuven) \n\n\n\nTalk Abstract: Predicting the impact of interventions in
the real world from observational data alone represents a major statistic
al challenge. Indeed\, treatment assignments are usually correlated with t
he predictors of the response\, resulting in a lack of data support for co
unterfactual predictions and therefore in poor quality estimates. Developm
ents in causal inference have led to methods addressing this confoundin
g by requiring a minimum level of overlap. However\, overlap is difficul
t to assess and usually not satisfied in practice. In this work\, we propose t
o circumvent the overlap assumption by predicting the impact of treatments
continuously over time using neural ordinary differential equations equip
ped with uncertainty estimates.\n\n\n\nThis presentation was held virtuall
y on January 19\, 2023 @ 11:00 AM\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
END:VEVENT
BEGIN:VEVENT
UID:424@fds.yale.edu
DTSTART;TZID=America/New_York:20230130T160000
DTEND;TZID=America/New_York:20230130T170000
DTSTAMP:20240228T053626Z
URL:https://fds.yale.edu/?post_type=event&p=2083
SUMMARY:S&DS Seminar: Ruishan Liu (Stanford)
DESCRIPTION:\n\n\n\nMachine learning for precision medicine\nSpeaker: Ruis
han Liu\, Postdoctoral Researcher\, Stanford University\nSpeaker Bio: R
uishan Liu is a postdoctoral researcher in Biomedical Data Science at Stan
ford University\, working with Prof. James Zou. She received her PhD in El
ectrical Engineering at Stanford University in 2022. Her research lies a
t the intersection of machine learning and its applications in human dis
eases\, health and genomics. She was the recipient of the Stanford Gradu
ate Fellowship
\, and was selected as a Rising Star in Data Science by the University of Ch
icago\, and the Rising Star in Engineering in Health by Johns Hopkins Univ
ersity and Columbia University. She led the project Trial Pathfinder\, whi
ch was selected as a Top Ten Clinical Research Achievement in 2022 and a final
ist for the Global Pharma Award in 2021.\nMonday\, January 30\, 2023\n3:30
pm - Pre-talk meet and greet teatime - Dana House\, 24 Hillhouse Avenue\n4
:00pm - 5:00pm - Talk - Mason Lab 211\, 9 Hillhouse Avenue\nIn-person semi
nars will be held at Mason Lab 211\, 9 Hillhouse Avenue with the option o
f virtual participation.\nWatch\n\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
END:VEVENT
BEGIN:VEVENT
UID:423@fds.yale.edu
DTSTART;TZID=America/New_York:20230131T130000
DTEND;TZID=America/New_York:20230131T140000
DTSTAMP:20240228T053626Z
URL:https://fds.yale.edu/?post_type=event&p=2082
SUMMARY:Special Seminar: Dr. Maria Rodriguez Martinez (IBM Research) "Inter
pretable deep learning for cancer personalized medicine"
DESCRIPTION:\n\n\n\n"Interpretable Deep Learning for Cancer Personalized Me
dicine"\nSpeaker: Maria Rodriguez Martinez\, PhD\nGroup Leader of Computat
ional Systems Biology\, IBM Research\, Zurich\, Switzerland\nHosted by: Jo
hn Tsang\, Director of the Center for Systems and Engineering Immunology\, P
rofessor of Immunobiology and Biomedical Engineering\nTalk Abstract: In rec
ent years
\, deep learning models have resulted in outstanding breakthrough performa
nces. However\, many models behave as black boxes that can hide data biase
s\, incorrect hypotheses or even software errors. In this talk\, I will il
lustrate how interpretable deep learning models can achieve both high pred
iction accuracy and transparency. First\, I will introduce multi-modal dee
p learning models that predict drug response while highlighting the geneti
c and chemical patterns that were most informative for making a predictio
n. I will also discuss how reinforcement learning approaches can facilitate th
e early phases of drug discovery and support the personalized design of ne
w candidate compounds. Focusing next on T cell-based immunotherapies\, I w
ill present a model to predict the binding of T cell receptors and epitope
s. This model can be coupled with an easy-to-use interpretable pipeline to
extract the binding rules governing T cell binding. These approaches ar
e a first step towards the design and engineering of receptors of improv
ed affinity. Finally\, I will discuss how the integration of AI and mechan
istic models is necessary to tackle many current computational challenges
and enable the personalized design of new therapeutic interventions.\nAb
out the speaker: Dr. María Rodríguez Martínez is the Techni
cal Leader of Systems Biology at IBM Research Europe (Switzerland) and an
associated member of the Department of Biology at ETH since 2014. A theore
tical physicist by training\, she became interested in the development of
computational and statistical approaches to unravel cancer molecular mecha
nisms using high-throughput multi-omics datasets and single-cell molecular
data. In recent years\, her team has specialized in the development of AI
approaches for personalized drug modeling. More recently\, she has been buildin
g multi-scale models of the immune system through a combination of deep le
arning and mechanistic models. Through this effort\, her team has develope
d deep learning models to predict the specificity of T cell receptors and
stochastic mechanistic models to recapitulate B cell development. She is al
so quite active in the area of interpretable deep learning. Deep learning
has achieved astounding performances in a broad range of disciplines\, but
breakthrough performances have often come at the price of a lack of infor
mation about the rules that govern a model's decision. Interpretabl
e deep learning aims to develop models that can not only make a prediction
with high accuracy\, but can also provide insight into the reasons underl
ying the prediction. In this area\, her team has contributed several nove
l methods for different applications in computational biology\, ranging fro
m AI-driven protein modeling to the integration of image and RNA-Seq dat
a modalities.\nAbout the Center: The newly established Yale Center for S
ystems and Engineering Immunology (CSEI) aims to bring together Yale fac
ulty and trainees from diverse departments across Yale University\, including the S
chool of Medicine\, School of Engineering and Applied Science\, and the Fa
culty of Arts and Sciences to deliberate and collaborate on systems\, quan
titative\, and synthetic immunology. The Center serves as an interdiscipli
nary home and meeting place for faculty\, researchers\, and students inter
ested in developing a quantitative\, predictive understanding of the immun
e system and in advancing technologies and computational approaches for sy
stematic engineering of synthetic immune molecules\, cells\, and systems t
o empower both basic understanding and biomedical applications. The Center
also aims to help enable computational\, data\, and technology-intensive s
tudies involving the immune system\, including those that study the inter
actions between the immune system and the (patho)physiology of all organ s
ystems in health and disease. Towards these goals\, the Center is launchin
g a seminar and chalk talk series and will be recruiting new faculty and s
taff to complement existing strengths at Yale. The CSEI is supported by th
e Yale School of Medicine Dean's office and the Department of Immun
obiology.\nDate: January 31\, 2023\nTime: 1:00 - 2:00 pm\nLocation: Brad
y Auditorium (BML 131)\, 310 Cedar Street\n
CATEGORIES:Seminar Series,Special Seminar
END:VEVENT
BEGIN:VEVENT
UID:422@fds.yale.edu
DTSTART;TZID=America/New_York:20230201T160000
DTEND;TZID=America/New_York:20230201T170000
DTSTAMP:20240228T053626Z
URL:https://fds.yale.edu/?post_type=event&p=2081
SUMMARY:S&DS Seminar: Frederic Koehler (Stanford)
DESCRIPTION:Speaker: Frederic Koehler\, Postdoctoral Fellow\, Stanford Univ
ersity\n\n\n\nTowards the Statistically Principled Design of ML Algorithm
s\nAbstract: What are the optimal algorithms for learning from data? Hav
e we found them already\, or are better ones out there to be discovered? M
aking these questions precise\, and answering them\, requires taking on the mat
hematically deep interplay between statistical and computational considera
tions. It also requires reconciling our theoretical toolbox with surprisin
g new phenomena arising from practice\, which seem to violate conventional
rules of thumb regarding algorithm and model design. I will discuss progr
ess along these lines: in terms of designing new algorithms for basic lear
ning problems\, controlling generalization in large statistical models\, a
nd understanding statistical questions arising from generative modeling.
\nSpeaker Bio: I am currently at Stanford University as a Motwani Postdoct
oral Fellow. Right before that\, I was a research fellow in UC Berkeley'
s Simons In
stitute in the Program on Computational Complexity of Statistical Inferenc
e. I received my PhD in Mathematics and Statistics from MIT\, where I wa
s co-advised by Ankur Moitra and Elchanan Mossel\, and before that I rece
ived my undergraduate degree in Mathematics at Princeton University. My curren
t research interests include computational learning theory and related top
ics: probability theory\, high-dimensional statistics\, optimization\, rel
ated aspects of statistical physics\, etc. In particular\, I am very inter
ested in learning and inference in graphical models.\nWednesday\, February
01\, 2023\n\n\n\n3:30pm - Pre-talk meet and greet teatime - Dana House\,
24 Hillhouse Ave.\n\n\n\n4:00pm - 5:00pm - Talk - Mason Lab 211\, 9 Hillh
ouse Ave\, New Haven\, CT\n\n\n\n\nWatch\n\n\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:420@fds.yale.edu
DTSTART;TZID=America/New_York:20230206T160000
DTEND;TZID=America/New_York:20230206T170000
DTSTAMP:20240227T174044Z
URL:https://fds.yale.edu/?post_type=event&p=2079
SUMMARY:S&DS Seminar: Raaz Dwivedi (Harvard & MIT)
DESCRIPTION:Speaker: Raaz Dwivedi\, FODSI postdoc fellow\, Computer Science
and Statistics\, Harvard University\; Electrical Engineering & Compu
ter Science\, MIT\n\n\n\n"From HeartSteps to HeartBeats: Personalized Deci
sion-making"\nAbstract: Ever-increasing access to data and computational pow
er allows us to make decisions that are personalized to users by taking th
eir behaviors and contexts into account. These developments are especially
useful in domains like mobile health and medicine. For effective personal
ized decision-making\, we need to revisit two fundamental tasks: (1) estim
ation and inference from data when there is no model for a decision's ef
fect on a user and (2) simulations when there is a known model for a dec
ision's effect on a user. Here we must overcome the difficulties facing c
lassical approaches\, namely statistical biases due to adaptively collect
ed data and computational bottlenecks caused by high-dimensional m
odels. This talk addresses both tasks. First\, I provide a nearest-neighb
or approach for unit-level statistical inference in sequential experiment
s. I also introduce a doubly robust variant of nearest neighbors that provide
s sharp error guarantees and helps measure a mobile app's effective
ness in promoting a healthier lifestyle with limited data. For the second ta
sk\, I introduce kernel thinning\, a practical strategy that provides near
-optimal distribution compression in near-linear time. This method yields
significant computational savings when simulating models of cardiac functi
oning.\nBio: Raaz Dwivedi is a FODSI postdoc fellow advised by Prof. Susan M
urphy and Prof. Devavrat Shah in CS and Statistics\, Harvard and EECS\, MI
T\, respectively. He earned his Ph.D. at EECS\, UC Berkeley\, advised by Pr
of. Martin Wainwright and Prof. Bin Yu\; and his bachelor's degree at EE
\, IIT Bombay\, advised by Prof. Vivek Borkar. His research builds statistica
lly and computationally efficient strategies for personalized decision-mak
ing with theory and methods spanning the areas of causal inference\, reinf
orcement learning\, random sampling\, and high-dimensional statistics. He
won the President of India Gold Medal at IIT Bombay\, the Berkeley Fellows
hip\, teaching awards at UC Berkeley and Harvard\, and a best student pape
r award for his work on optimal compression.\n\n\nMonday\, February 6\, 20
23\n\n\n\n3:30pm - Pre-talk meet and greet teatime - Dana House\, 24 Hillh
ouse Avenue\n\n\n\n4:00pm - 5:00 pm - Talk - Mason Lab 211\, 9 Hillhouse A
venue\n\n\n\n\n\n\nWatch\n\n\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:421@fds.yale.edu
DTSTART;TZID=America/New_York:20230208T120000
DTEND;TZID=America/New_York:20230208T130000
DTSTAMP:20240226T190409Z
URL:https://fds.yale.edu/events/data-science-lit-search-nightmares-and-how
-to-avoid-them/
SUMMARY:Data Science Lit Search Nightmares (and how to avoid them)
DESCRIPTION:Have you ever heard horror stories about an embarrassing meetin
g when someone learned they'd missed half of the seminal research papers r
elated to their thesis? Do you have nightmares that someone else just publ
ished another version of your “groundbreaking” research? Let’s face
it\, we need to do so much research on who is doing what research that the
re’s no time left to do the research!\n\n\n\nJoin us for the first annua
l FDS/Marx Library collaboration lunch\, where you'll hear from librarians
about how to optimize your lit review workflows and set yourself up for s
uccess while freeing up time in the process. We will cover file management
strategies\, optimizing Zotero for LaTeX\, and how to pull PDF informatio
n from the web into Zotero. \n\n\n\nLunch included! Bring a friend! Meet
a librarian!\n\n\n\n17 Hillhouse Ave\, 3rd floor\n
CATEGORIES:Training
LOCATION:17 Hillhouse Ave\, 3rd floor\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=17 Hillhouse Ave\, 3rd floor:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:419@fds.yale.edu
DTSTART;TZID=America/New_York:20230208T160000
DTEND;TZID=America/New_York:20230208T170000
DTSTAMP:20240227T174044Z
URL:https://fds.yale.edu/?post_type=event&p=2078
SUMMARY:S&DS Seminar: Omar Montasser (TTI-Chicago)
DESCRIPTION:Speaker: Omar Montasser\, Toyota Technological Institute at Chi
cago\n\n\n\nWhat\, When\, and How can we Learn Adversarially Robustly?\n\nAbst
ract: Despite extraordinary progress\, current machine learning systems ha
ve been shown to be brittle against adversarial examples: seemingly innocu
ous but carefully crafted perturbations of test examples that cause machin
e learning predictors to misclassify. Can we learn predictors robust to ad
versarial examples? And how? There has been much empirical interest in thi
s major challenge in machine learning\, and in this talk\, we will present
a theoretical perspective. We will illustrate the need to go beyond tradi
tional approaches and principles\, such as empirical (robust) risk minimiz
ation\, and present new algorithmic ideas with stronger robust learning gu
arantees.\n\nBio: Omar Montasser is a PhD candidate at TTI-Chicago advised by
Nathan Srebro. His research broadly explores the theory and foundations of
machine learning. Recently\, his research has focused on understanding an
d characterizing adversarially robust learning\, and on designing learning
algorithms with provable robustness guarantees under different settings.
His work has been recognized by a best student paper award at COLT (2019).
\nWednesday\, February 8\, 2023\n\n\n\n3:30pm - Pre-talk meet and greet te
atime - Dana House\, 24 Hillhouse Avenue\n\n\n\n4:00pm - 5:00 pm - Talk -
Mason Lab 211\, 9 Hillhouse Avenue\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:418@fds.yale.edu
DTSTART;TZID=America/New_York:20230215T160000
DTEND;TZID=America/New_York:20230215T170000
DTSTAMP:20240227T174044Z
URL:https://fds.yale.edu/?post_type=event&p=2077
SUMMARY:S&DS Seminar: Sinho Chewi (MIT)
DESCRIPTION:Speaker: Sinho Chewi\, Mathematics\, Probability & Statist
ics\, MIT\n\n\n\nTowards a theory of complexity of sampling\, inspired by optimization\n\nAbstract: Sampling is a fundamental and widespread algorithmic
primitive that lies at the heart of Bayesian inference and scientific com
puting\, among other disciplines. Recent years have seen a flood of works
aimed at laying down the theoretical underpinnings of sampling\, in analog
y to the fruitful and widely used theory of convex optimization. In this t
alk\, I will discuss some of my work in this area\, focusing on new conver
gence guarantees obtained via a proximal algorithm for sampling\, as well
as a new framework for studying the complexity of non-log-concave sampling
.\n\nBio: I am an Applied Mathematics PhD candidate at the Massachusetts Insti
tute of Technology (MIT)\, advised by Philippe Rigollet. I received my B.S
. in Engineering Mathematics and Statistics from University of California\
, Berkeley in 2018. In Fall 2021\, I participated in the Simons Institute
program on Geometric Methods in Optimization and Sampling and co-organized
(with Kevin Tian) a working group on the complexity of sampling. In Sprin
g 2022\, I visited Jonathan Niles-Weed at New York University (NYU). In Su
mmer 2022\, I was a research intern at Microsoft Research\, supervised by
Sébastien Bubeck and Adil Salim.\n\nWednesday\, February 15\, 2023\
n\n\n\n3:30pm - Pre-talk meet and greet teatime - Dana House\, 24 Hillhous
e Avenue\n\n\n\n4:00pm - 5:00 pm - Talk - Mason Lab 211\, 9 Hillhouse Aven
ue\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:417@fds.yale.edu
DTSTART;TZID=America/New_York:20230220T160000
DTEND;TZID=America/New_York:20230220T170000
DTSTAMP:20240227T174044Z
URL:https://fds.yale.edu/?post_type=event&p=2076
SUMMARY:S&DS Seminar: Zhimei Ren (University of Chicago)
DESCRIPTION:Speaker: Zhimei Ren\, Postdoctoral Researcher in Statistics\, U
niversity of Chicago\n\n\n\n"Stable Variable Selection with Knockoffs"\n\nAbst
ract: A common problem in many modern statistical applications is to find
a set of important variables\, from a pool of many candidates\, that explain the response of interest. For this task\, model-X knockoffs o
ffers a general framework that can leverage any feature importance measure
to produce a variable selection algorithm: it discovers true effects whil
e rigorously controlling the number or fraction of false positives\, pavin
g the way for reproducible scientific discoveries. The model-X knockoffs\,
however\, is a randomized procedure that relies on the one-time construct
ion of synthetic (random) variables. Different runs of model-X knockoffs o
n the same dataset often result in different sets of selected variables\,
which is not desirable for the reproducibility of the reported results.\n\nIn this talk\, I will introduce derandomization schemes that aggregate the se
lection results across multiple runs of the knockoffs algorithm to yield s
table selection. In the first part\, I will present a derandomization sche
me that controls the number of false positives\, i.e.\, the per family err
or rate (PFER) and the k family-wise error rate (k-FWER). In the second pa
rt\, I will talk about an alternative derandomization scheme with provable
false discovery rate (FDR) control. Equipped with these derandomization s
teps\, the knockoffs framework provides a powerful tool for making reprodu
cible scientific discoveries. The proposed methods are evaluated on both s
imulated and real data\, demonstrating comparable power and dramatically l
ower selection variability when compared with the original model-X knockof
fs.\n\nBio: Zhimei Ren is a postdoctoral researcher in the Statistics Departme
nt at the University of Chicago\, advised by Professor Rina Foygel Barber.
Before joining the University of Chicago\, she obtained her Ph.D. in Stat
istics from Stanford University\, advised by Professor Emmanuel Candès. Prior to this\, she received a Bachelor's degree in Statistics from Peking University.\n\nMonday\, February 20\, 2023\n\n3:30pm - Pre-talk meet and greet teatime - Dana House\, 24 Hillhouse Avenue\n\n4:00pm - 5:00pm - Talk - Mason Lab 211\, 9 Hillhouse Avenue\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:416@fds.yale.edu
DTSTART;TZID=America/New_York:20230222T160000
DTEND;TZID=America/New_York:20230222T170000
DTSTAMP:20240227T174044Z
URL:https://fds.yale.edu/?post_type=event&p=2075
SUMMARY:S&DS Seminar: Tailin Wu (Stanford)
DESCRIPTION:Speaker: Tailin Wu\, Postdoctoral Scholar in Computer Science\,
Stanford University\n\n\n\nLearning structured representations for acceler
ating scientific discovery and simulation\n\nAbstract: Across most disciplines
of science\, e.g.\, physics\, chemistry\, biomedicine\, materials\, mecha
nical engineering\, and energy\, one of the most critical challenges is that their simulations and discoveries are typically slow due to the large-scale\, complex\, and multi-scale nature of the systems involved. In this talk\, I will introduce
my research that tackles this challenge by developing machine learning mo
dels with structured and efficient representations for accelerating scient
ific discovery and simulation. To accelerate scientific discovery\, I deve
loped neuro-symbolic methods which can distill the data into human-interpr
etable symbolic knowledge (governing equations and relational structures)
and generalize to more complex data in inference. To accelerate large-scal
e scientific simulations\, I developed structured representations to accel
erate critical scientific simulations for fluid dynamics\, plasma science\
, and generic partial differential equations (PDEs). For example\, I devel
oped a hybrid particle-fluid representation for simulating a large-scale l
aser-plasma interaction in a national lab facility that has important appl
ications in physics\, materials\, and biomedical science. Our model is abl
e to simulate millions of particles per time step\, orders of magnitude fa
ster than the classical solver\, and significantly reduce long-term predic
tion error compared to strong deep learning baselines.\n\nBio: Tailin Wu is a
postdoctoral scholar in the Computer Science Department at Stanford Univer
sity\, working with Prof. Jure Leskovec. He received his Ph.D. from MIT Ph
ysics\, where his thesis focused on AI for Physics and Physics for AI. His
research interests include developing machine learning methods for large-
scale scientific simulations\, neuro-symbolic methods for scientific disco
very\, and representation learning\, using tools of graph neural networks\
, information theory\, and physics. His work has been published in top mac
hine learning conferences and leading physics journals\, and featured in M
IT Technology Review. He also serves as a reviewer for high-impact journal
s such as PNAS\, Nature Communications\, Nature Machine Intelligence\, and
Science Advances.\n\nWednesday\, February 22\, 2023\n\n\n\n3:30pm - Pre-t
alk meet and greet teatime - Dana House\, 24 Hillhouse Avenue\n\n\n\n4:00p
m - 5:00pm - Talk - Mason Lab 211\, 9 Hillhouse Avenue with the option of
virtual participation\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:411@fds.yale.edu
DTSTART;TZID=America/New_York:20230223T103000
DTEND;TZID=America/New_York:20230223T113000
DTSTAMP:20240227T164205Z
URL:https://fds.yale.edu/?post_type=event&p=2070
SUMMARY:S&DS Seminar: Ilias Zadik (MIT)
DESCRIPTION:Speaker: Ilias Zadik\, MIT\n\n\n\nIn-Person seminars will be he
ld at Dunham Lab Room 220 with optional remote access:(https://yale.hosted
.panopto.com/Panopto/Pages/Sessions/List.aspx?folderID=f8b73c34-a27b-42a7-
a073-af2d00f90ffa)\n\nThe price of computational efficiency in high-dimensional estimation\n\nAbstract: In recent years we have experienced remarkable growth in the number and size of available datasets. Such growth has led to t
he intense and challenging pursuit of estimators which are provably both c
omputationally efficient and statistically accurate. Notably\, the analysi
s of polynomial-time estimators has revealed intriguing phenomena in several high-dimensional estimation tasks\, such as the apparent failure of such estimators to reach the optimal statistical guarantees achieved among all estimators (that is\, the presence of a non-trivial "computational-statistical trade-off").\n\nIn this talk\, I will present new such al
gorithmic results for the well-studied planted clique model and for the fu
ndamental sparse regression model. For planted clique\, we reveal the surp
rising severe failure of the Metropolis process to work in polynomial-time
\, even when simple degree heuristics succeed. In particular\, our result
resolves a well-known 30-year-old open problem on the performance of the
Metropolis process for the model\, posed by Jerrum in 1992. For sparse reg
ression\, we show the failure of large families of polynomial-time estimat
ors\, such as MCMC and low-degree polynomial methods\, to improve upon the
best-known polynomial-time regression methods. As an outcome\, our work o
ffers rigorous evidence that popular regression methods such as LASSO are
optimally balancing their computational and statistical resources.\n\nBio: My
research lies broadly in the interface of high dimensional statistics\, th
e theory of machine learning and computation\, and applied probability. A
lot of my work has the goal to build and use mathematical tools to bring i
nsights into the computational and statistical challenges of modern machin
e learning tasks. Website: https://iliaszadik.github.io/\n\nThursday\, Feb
ruary 23\, 2023\n\n\n\n10:30 am - 11:30 am - Talk - Dunham Lab\, Room 220\
, 10 Hillhouse Avenue\, 2nd Floor with the option of virtual participation
\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:DL220\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=DL220:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:415@fds.yale.edu
DTSTART;TZID=America/New_York:20230227T160000
DTEND;TZID=America/New_York:20230227T170000
DTSTAMP:20240227T164205Z
URL:https://fds.yale.edu/?post_type=event&p=2074
SUMMARY:S&DS Seminar: Lu Lu (University of Pennsylvania)
DESCRIPTION:Speaker: Lu Lu\, Assistant Professor\, Department of Chemical a
nd Biomolecular Engineering\, University of Pennsylvania\n\n\n\nIn-Person
seminars will be held at Mason Lab 211 with optional remote access:(https:
//yale.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx?folderID=f8b73c
34-a27b-42a7-a073-af2d00f90ffa)\n\nPhysics-informed deep learning: Blending data and physics for learning functions and operators\n\nAbstract: Deep learning
has achieved remarkable success in diverse applications\; however\, its u
se in scientific applications has emerged only recently. In this talk\, I
will first review physics-informed neural networks (PINNs) and available e
xtensions for solving forward and inverse problems of partial differential
equations (PDEs). I will then introduce a lesser-known but powerful result
that a NN can accurately approximate any nonlinear operator. This universa
l approximation theorem of operators is suggestive of the potential of NNs
in learning operators of complex systems. I will present the deep operato
r network (DeepONet) to learn various operators that represent determinist
ic and stochastic differential equations. I will demonstrate the effective
ness of DeepONet and its extensions to diverse multiphysics and multiscale
problems\, such as nanoscale heat transport\, bubble growth dynamics\, hi
gh-speed boundary layers\, electroconvection\, hypersonics\, and geologica
l carbon sequestration. Deep learning models are usually limited to interp
olation scenarios\, and I will quantify the extrapolation complexity and d
evelop a complete workflow to address the challenge of extrapolation for d
eep neural operators.\n\nBio: I am an Assistant Professor in the Department of Chemical and Biomolecular Engineering at the University of Pennsylvania. My curre
nt research interest is in scientific machine learning. My broader research
interests focus on multiscale modeling and high performance computing. Web
site: https://lululxvi.github.io/\n\nMonday\, February 27\, 2023\n\n\n\n3:
30pm - Pre-talk meet and greet teatime - Dana House\, 24 Hillhouse Avenue\
n\n\n\n4:00pm - 5:00 pm - Talk - Mason Lab 211\, 9 Hillhouse Avenue with t
he option of virtual participation\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:414@fds.yale.edu
DTSTART;TZID=America/New_York:20230301T160000
DTEND;TZID=America/New_York:20230301T170000
DTSTAMP:20240227T164205Z
URL:https://fds.yale.edu/?post_type=event&p=2073
SUMMARY:S&DS Seminar: Oscar Leong (Caltech)
DESCRIPTION:Speaker: Oscar Leong\, Caltech\n\n\n\nIn-Person seminars will b
e held at Mason Lab 211 with optional remote access:(https://yale.hosted.p
anopto.com/Panopto/Pages/Sessions/List.aspx?folderID=f8b73c34-a27b-42a7-a0
73-af2d00f90ffa)\n\nThe Power and Limitations of Convexity in Data Science\n\nAbst
ract: Optimization is a fundamental pillar of data science. Traditionally\
, the art and challenge in optimization lay primarily in problem formulati
on to ensure desirable properties such as convexity. In the context of con
temporary data science\, however\, optimization is practiced differently\,
with scalable local search methods applied to nonconvex objectives being
the dominant paradigm in high-dimensional problems. This has brought a num
ber of foundational mathematical challenges at the interface between optim
ization and data science pertaining to the dichotomy between convexity and
nonconvexity.\n\nIn this talk\, I will discuss some of my work addressing the
se challenges in regularization\, a technique to encourage structure in so
lutions to statistical estimation and inverse problems. Even setting aside
computational considerations\, we currently lack a systematic understandi
ng from a modeling perspective of what types of geometries should be prefe
rred in a regularizer for a given data source. In particular\, given a dat
a distribution\, what is the optimal regularizer for such data and what ar
e the properties that govern whether it is amenable to convex regularizati
on? Using ideas from star geometry\, Brunn-Minkowski theory\, and variatio
nal analysis\, I show that we can characterize the optimal regularizer for
a given distribution and establish conditions under which this optimal re
gularizer is convex. Moreover\, I describe results establishing the robust
ness of our approach\, such as convergence of optimal regularizers with in
creasing sample size and statistical learning guarantees with applications
to several classes of regularizers of interest.\n\nBio: I am a von Kármán Instructor at Caltech in the Computing + Mathematical Science
s department\, hosted by Venkat Chandrasekaran. I also work with Katie Bou
man and the Computational Cameras group. I completed my PhD from Rice Univ
ersity in Computational and Applied Mathematics under the supervision of P
aul Hand and was an NSF Graduate Research Fellow. I received my undergradu
ate degree in Mathematics from Swarthmore College.\n\nMy research interests li
e in the mathematics of data science\, inverse problems\, machine learning
\, and optimization. Much of my work concerns solving signal recovery prob
lems with approaches inspired by deep learning and uses tools from high di
mensional probability\, random matrix theory\, and optimization to develop
provable recovery guarantees.\n\nWednesday\, March 01\, 2023\n\n\n\n3:30p
m - Pre-talk meet and greet teatime - Dana House\, 24 Hillhouse Avenue\n\n
\n\n4:00pm - 5:00 pm - Talk - Mason Lab 211\, 9 Hillhouse Avenue with the
option of virtual participation\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:413@fds.yale.edu
DTSTART;TZID=America/New_York:20230306T160000
DTEND;TZID=America/New_York:20230306T170000
DTSTAMP:20240227T164205Z
URL:https://fds.yale.edu/?post_type=event&p=2072
SUMMARY:S&DS Seminar: Theodor Misiakiewicz (Stanford)
DESCRIPTION:Speaker: Theodor Misiakiewicz\, Stanford University\n\n\n\nIn-P
erson seminars will be held at Mason Lab 211 with optional remote access:(
https://yale.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx?folderID=
f8b73c34-a27b-42a7-a073-af2d00f90ffa)\n\nNew Statistical and Computational Phenomena From Deep Learning\n\nAbstract: Deep learning methodology has presented
major challenges for statistical learning theory. Indeed deep neural netw
orks often operate in regimes outside the realm of classical statistics an
d optimization wisdom. In this talk\, we will consider two illustrative ex
amples which clarify some of these new challenges. The first example consi
ders an instance where kernel ridge regression with a simple RBF kernel ac
hieves optimal test error when it perfectly fits the noisy training data.
Why can we interpolate noisy data and still generalize well? Why can overf
itting be benign in kernel ridge regression? The second example\, computational in nature\, considers fitting two different smooth ridge f
unctions with deep neural networks (DNNs). Both can be estimated at the sa
me near-parametric rate by DNNs trained with unbounded computational resou
rces. However\, empirically\, learning becomes much harder for one of thes
e functions when restricted to DNNs trained using SGD. Why does SGD succee
d on some functions and fail on others? The goal of this talk will be to u
nderstand these two simulations. In particular\, we will demonstrate quant
itative theories that can precisely capture both phenomena.\n\nBio: My interes
t lies broadly at the intersection of statistics\, machine learning\, prob
ability and computer science. Lately\, I have been focusing on the statist
ical and computational aspects of deep learning\, and the performance of k
ernel and random feature methods in high dimension. Some of the questions
I am currently interested in: When can we expect neural networks to outper
form kernel methods? When can neural networks beat the curse of dimensiona
lity? On the other hand\, what are the computational limits of gradient-tr
ained neural networks? What structures in real data allow for efficient l
earning? When is overfitting benign? How much overparametrization is optim
al? When can we expect universal or non-universal behavior in empirical ri
sk minimization? Website: https://misiakie.github.io/\n\nMonday\, March 06
\, 2023\n\n\n\n3:30pm - Pre-talk meet and greet teatime - Dana House\, 24
Hillhouse Avenue\n\n\n\n4:00pm - 5:00 pm - Talk - Mason Lab 211\, 9 Hillho
use Avenue with the option of virtual participation\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:412@fds.yale.edu
DTSTART;TZID=America/New_York:20230308T160000
DTEND;TZID=America/New_York:20230308T170000
DTSTAMP:20240227T164205Z
URL:https://fds.yale.edu/?post_type=event&p=2071
SUMMARY:S&DS Seminar: Masatoshi Uehara (Cornell)
DESCRIPTION:Speaker: Masatoshi Uehara\, Cornell University\n\n\n\nIn-Person
seminars will be held at Mason Lab 211 with optional remote access:(https
://yale.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx?folderID=f8b73
c34-a27b-42a7-a073-af2d00f90ffa)\n\nTalk Title: TBA\n\nAbstract: To be announced\n\nBio: My interests are reinforcement learning\, causal inference\, and online
learning. These days\, I am working on the application of RL to medicine a
nd social sciences. Website: https://www.masatoshiuehara.com/\n\nWednesday\, March 08\, 2023\n\n3:30pm - Pre-talk meet and greet teatime - Dana House\, 24 Hillhouse Avenue\n\n4:00pm - 5:00 pm - Talk - Mason Lab 211\, 9 Hillhouse Avenue with the option of virtual participation\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:405@fds.yale.edu
DTSTART;TZID=America/New_York:20230327T160000
DTEND;TZID=America/New_York:20230327T170000
DTSTAMP:20240226T190404Z
URL:https://fds.yale.edu/events/sds-colloquium-nadav-cohen-tel-aviv-univer
sity-what-makes-data-suitable-for-deep-learning/
SUMMARY:S&DS Colloquium: Nadav Cohen (Tel Aviv University) "What Makes
Data Suitable for Deep Learning?"
DESCRIPTION:Deep learning is delivering unprecedented performance when appl
ied to various data modalities\, yet there are data distributions over whi
ch it utterly fails. The question of what makes a data distribution suitab
le for deep learning is a fundamental open problem in the field. In this
talk I will present a recent theory aiming to address the problem via too
ls from quantum physics. The theory establishes that certain neural netw
orks are capable of accurate prediction over a data distribution if and on
ly if the data distribution admits low quantum entanglement under certain
partitions of features. This brings forth practical methods for adaptati
on of data to neural networks\, and vice versa. Experiments with widespr
ead models over various datasets will demonstrate the findings. An under
lying theme of the talk will be the potential of physics to advance our un
derstanding of the relation between deep learning and real-world data.\n\n
\n\nWorks covered in the talk were in collaboration with my graduate stude
nts Noam Razin\, Yotam Alexander\, Nimrod De La Vega and Tom Verbin.\n\n\n
\nBio: Nadav Cohen is an Asst. Prof. of Computer Science at Tel Aviv Unive
rsity. His research focuses on the theoretical and algorithmic foundatio
ns of deep learning. He earned a BSc in electrical engineering and a BSc
in mathematics (both summa cum laude) at the Technion Excellence Program
for Distinguished Undergraduates\, followed by a PhD (direct track) in com
puter science at the Hebrew University of Jerusalem. Subsequently\, he w
as a postdoctoral research scholar at the Institute for Advanced Study in
Princeton. For his contributions to deep learning\, Nadav received a num
ber of awards\, including the Google Doctoral Fellowship in Machine Learni
ng\, the Rothschild Postdoctoral Fellowship\, the Zuckerman Postdoctoral F
ellowship\, and the Google Research Scholar Award.\n\n\n\nIn-Person semina
rs will be held at Mason Lab 211\, 9 Hillhouse Avenue with the option of v
irtual participation: https://yale.hosted.panopto.com/Panopto/Pages/Viewe
r.aspx?id=53e6ca36-44bf-4760-83fc-af93011fd562\n\n\n\n3:30pm - Pre-talk
meet and greet teatime - Dana House\, 24 Hillhouse Avenue\n
CATEGORIES:Colloquium,Seminar Series,Statistics & Data Science Seminar
END:VEVENT
BEGIN:VEVENT
UID:406@fds.yale.edu
DTSTART;TZID=America/New_York:20230329T160000
DTEND;TZID=America/New_York:20230329T170000
DTSTAMP:20240226T190405Z
URL:https://fds.yale.edu/events/fds-colloquium-nathan-srebro-ttic-interpol
ation-learning-and-overfitting-with-linear-predictors-and-short-programs/
SUMMARY:FDS Colloquium: Nathan Srebro (TTIC) “Interpolation Learning and
Overfitting with Linear Predictors and Short Programs”
DESCRIPTION:"Interpolation Learning and Overfitting with Linear Predictors
and Short Programs"\n\n\n\nLocation: Mason 211 or remote access: https://
yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=7e9e0891-7848-44ad-91
e7-af93011fd580 \n\n\n\nSpeaker: Nathan Srebro\nProfessor\, Toyota T
echnological Institute at Chicago\n\n\n\nAbstract: Classical theory\, conv
entional wisdom\, and all textbooks tell us to avoid reaching zero train
ing error and overfitting the noise\, and instead balance model fit and co
mplexity. Yet\, recent empirical and theoretical results suggest that in
many cases overfitting is benign\, and even interpolating the training da
ta can lead to good generalization. Can we characterize and understand w
hen overfitting is indeed benign\, and when it is catastrophic as classic
theory suggests? And can existing theoretical approaches be used to st
udy and explain benign overfitting and the "double descent" curve? I wil
l discuss interpolation learning in linear (and kernel) methods\, as well
as using the universal "minimum description length" or "shortest program"
learning rule.\n\n\n\n\n\n\n\nBio: Nati (Nathan) Srebro is a professor at
the Toyota Technological Institute at Chicago\, with cross-appointments a
t the University of Chicago's Department of Computer Science\, and Committ
ee on Computational and Applied Mathematics. He obtained his PhD from the
Massachusetts Institute of Technology in 2004\, and previously was a postd
octoral fellow at the University of Toronto\, a visiting scientist at IBM\
, and an associate professor at the Technion. \n\n\n\nDr. Srebro’s re
search encompasses methodological\, statistical and computational aspects
of machine learning\, as well as related problems in optimization. Some of
Srebro’s significant contributions include work on learning “wider”
Markov networks\, introducing the use of the nuclear norm for machine lea
rning and matrix reconstruction\, work on fast optimization techniques for
machine learning\, and on the relationship between learning and optimizat
ion. His current interests include understanding deep learning through a d
etailed understanding of optimization\, distributed and federated learning
\, algorithmic fairness and practical adaptive data analysis.\n
CATEGORIES:Colloquium,Seminar Series
END:VEVENT
BEGIN:VEVENT
UID:404@fds.yale.edu
DTSTART;TZID=America/New_York:20230331T120000
DTEND;TZID=America/New_York:20230331T130000
DTSTAMP:20240226T190404Z
URL:https://fds.yale.edu/events/fds-colloquium-tara-javidi-ucsd-a-conseque
ntial-view-of-information-for-statistical-learning-and-optimization/
SUMMARY:FDS Colloquium: Tara Javidi (UCSD) "A (Con)Sequential View of Infor
mation for Statistical Learning and Optimization"
DESCRIPTION:A (Con)Sequential View of Information for Statistical Learning
and Optimization\n\n\n\nSpeaker: Tara JavidiJacobs Family Scholar and Prof
essorElectrical and Computer EngineeringUCSD\n\n\n\nAbstract: In most comm
unication systems\, adapting transmission strategies to the (unpredictable
) realization of channel output at the receiver requires an (unrealistic)
assumption about the availability of a reliable “feedback” channel. Th
is unfortunate fact\, combined with the historical linkage between teaching
information theory and digital communication curriculum has kept “feedba
ck information theory” less taught\, discussed\, appreciated and underst
ood compared to other topics in our field.\n\n\n\nThis talk\, in contrast\
, highlights important and challenging problems in machine learning\, opti
mization\, statistics\, and control theory\, where the problem of acquirin
g information in an adaptive manner arises very naturally. Thus\, I will a
rgue that an increased emphasis on (teaching) feedback information theory
can provide vast and exciting research opportunities at the intersection o
f information theory and these fields. In particular\, I will revisit simp
le-to-teach results in feedback information theory including sequential hy
pothesis testing\, arithmetic coding\, successive refinement\, noisy binar
y search\, and posterior matching. Drawing on my own research\, I will als
o highlight the successful application of these sequential techniques in a
variety of problem instances such as black-box optimization\, distributio
n estimation\, and active machine learning with imperfect labels.\n\n\n\nS
peaker bio: Tara Javidi received her BS in electrical engineering at Shar
if University of Technology\, Tehran\, Iran. She received her MS degrees
in electrical engineering (systems) and in applied mathematics (stochasti
c analysis) from the University of Michigan\, Ann Arbor as well as her Ph
.D. in electrical engineering and computer science in 2002. She is curren
tly a Jacobs Family Scholar and Professor of Electrical and Computer Engi
neering and a founding co-director of the Center for Machine-Intelligence
\, Computing and Security (MICS) at UCSD.\n\n\n\nTara Javidi’s research
interests are in theory of active learning\, information acquisition and
statistical inference\, information theory with feedback\, stochastic
control theory\, and wireless networks. \n\n\n\nLocation: In-person at YI
NS\, 17 Hillhouse Ave\, 3rd floor. Yale-only livestream: https://yale.host
ed.panopto.com/Panopto/Pages/Viewer.aspx?id=accec6b8-cece-4306-869b-afce01
58dceb \n\n\n\nLunch will be served.\n
CATEGORIES:Colloquium,Seminar Series
END:VEVENT
BEGIN:VEVENT
UID:403@fds.yale.edu
DTSTART;TZID=America/New_York:20230331T150000
DTEND;TZID=America/New_York:20230331T160000
DTSTAMP:20240226T190404Z
URL:https://fds.yale.edu/events/fds-faculty-showcase/
SUMMARY:FDS Faculty Showcase
DESCRIPTION:Location: YINS\, 17 Hillhouse Avenue\, 3rd floor. Streaming ava
ilable to the Yale community only: https://yale.hosted.panopto.com/Panopto
/Pages/Viewer.aspx?id=81b6f8fe-5edd-4813-8f04-afcf00d6f70d \n\n\n\nWe invi
te you to join us at the Yale Institute for Foundations of Data Science (F
DS) Faculty Showcase on March 31st at 3:00 PM. Eleven distinguished Yale facu
lty members will present their research and insights\, including Andre Wib
isono\, Rex Ying\, Brian Macdonald\, Ethan Meyers\, Leying Guan\, Jason Sh
aw\, Yihong Wu\, Lucila Ohno-Machado\, Zhuoran Yang\, Casey King and Madih
a Tahir. Each speaker will have just five minutes to tantalize the communi
ty and stimulate future conversation and collaboration. Refreshments will
be provided. This is a wonderful opportunity to learn about these esteemed
faculty members.\n\n\n\n\n\n\n\nSpeakers:\n\n\n\nAndre Wibisono\n\n\n\nRe
x Ying\n\n\n\nBrian Macdonald\n\n\n\nEthan Meyers\n\n\n\nLeying Guan\n\n\n
\nJason Shaw\n\n\n\nYihong Wu\n\n\n\nLucila Ohno-Machado\n\n\n\nZhuoran Ya
ng\n\n\n\nCasey King\n\n\n\nMadiha Tahir\n\n\n\n\n\n\n\n\n
CATEGORIES:Special Seminar
END:VEVENT
BEGIN:VEVENT
UID:402@fds.yale.edu
DTSTART;TZID=America/New_York:20230403T143000
DTEND;TZID=America/New_York:20230403T153000
DTSTAMP:20240227T154008Z
URL:https://fds.yale.edu/?post_type=event&p=2061
SUMMARY:Gerstein Lab: Diego Garrido (Univ of Barcelona) “A multivariate a
pproach to study the genetic determinants of phenotypic traits”
DESCRIPTION:Diego Garrido-Martín\, PhD\n\n\n\nSpeaker: Diego Garrido-Mar
tín\, PhD\, Assistant Professor\, Department of Genetics\, Microbiology an
d Statistics\, University of Barcelona (Spain)\n\n\n\nTitle: “A multivari
ate approach to study the genetic determinants of phenotypic traits”\n\n
\n\nDate: Monday\, April 3rd\, 2023\n\n\n\nTime: 2:30 – 3:30 PM\n\n\n\nP
lace: Yale Science Building\, Room 352\n\n\n\nHost: Mark Gerstein\n\n\n\nA
bstract: The increasing avail
ability of phenotypic data at multiple levels &ndash\; from the organismal
to the molecular &ndash\; in large cohorts of genotyped individuals enabl
es genetic association studies (GWAS\, molecular QTL mapping). These studi
es often test association with genetic variants using a single trait at a
time\, even though many biological phenotypes are intrinsically multi-trai
t: size and connectivity of brain regions\, levels of blood lipids\, facia
l and allometric traits\, composition of the gut microbiota\, abundances o
f alternative splicing isoforms\, single-cell gene expression across cell
types\, or even automatically learnt features from histological images via
deep convolutional autoencoder networks. Because of the correlated struct
ure of these traits\, joint (multivariate) analysis often results in incre
ased statistical power to detect genetic associations\, even when only a s
mall fraction of the traits is affected by the genetic variants tested. Ho
wever\, commonly used multivariate methods either lack interpretability\,
tend to make strong assumptions on the distribution of the traits of inter
est or do not scale well to the size of current datasets. In this context\
, PERMANOVA offers a powerful non-parametric approach. However\, it relies
on permutations to assess significance\, which hinders the analysis of la
rge datasets. Here\, we derive the limiting null distribution of the PERMA
NOVA test statistic\, providing a framework for the fast computation of as
ymptotic p-values. We show that the asymptotic test presents controlled ty
pe I error and high power\, comparable to or higher than parametric approa
ches. We illustrate the applicability of our method in a number of use-cas
es. Using the GTEx cohort\, we perform the first population-biased splicin
g QTL mapping study across multiple tissues. We identify thousands of gene
tic variants that affect alternative splicing differently depending on eth
nicity\, including potential disease markers. Using the UK Biobank cohort\
, we perform the largest GWAS to date of MRI-derived volumes of hippocampa
l subfields. Most of the identified loci have not been previously related
to the hippocampus\, but many are associated to cognition or brain disorde
rs\, thus contributing to understanding the intermediate traits through which
genetic variants impact complex organismal phenotypes.\n
CATEGORIES:Seminar Series
END:VEVENT
BEGIN:VEVENT
UID:400@fds.yale.edu
DTSTART;TZID=America/New_York:20230403T160000
DTEND;TZID=America/New_York:20230403T170000
DTSTAMP:20240226T190403Z
URL:https://fds.yale.edu/events/sds-seminar-sebastian-pokutta-tu-berlin-co
nditional-gradients-in-machine-learning/
SUMMARY:S&DS Seminar: Sebastian Pokutta (TU Berlin)\, "Conditional Gra
dients in Machine Learning"
DESCRIPTION:"Conditional Gradients in Machine Learning" \n\n\n\nSpeaker: Se
bastian Pokutta (TU Berlin)\n\n\n\nMonday\, April 03\, 2023\, 4:00PM to 5:
00PM \n\n\n\n3:30pm - Pre-talk meet and greet teatime - Dana House\, 24 Hi
llhouse Avenue\n\n\n\nLocation: Mason Lab\, Rm. 211\, 9 Hillhouse Avenue N
ew Haven\, CT 06511 or via Panopto\n\n\n\nAbstract: Conditional Gradient m
ethods are an important class of methods to minimize (non-)smooth convex f
unctions over (combinatorial) polytopes. Recently these methods received a
lot of attention as they allow for structured optimization and hence lear
ning\, incorporating the underlying polyhedral structure into solutions. I
n this talk I will give a broad overview of these methods\, their applicat
ions\, as well as present some recent results both in traditional optimiza
tion and learning as well as in deep learning. \n\n\n\nSpeaker Bio: Sebast
ian Pokutta is the Vice President of the Zuse Institute Berlin (ZIB) and a
Professor of Mathematics at TU Berlin with a research focus on Artificial
Intelligence and Optimization. Having received both his diploma and Ph.D.
in mathematics from the University of Duisburg-Essen in Germany\, Pokutta
was a postdoctoral researcher and visiting lecturer at MIT\, worked for I
BM ILOG\, and Krall Demmel Baumgarten. Prior to joining ZIB and TU Berlin\
, he was the David M. McKenney Family Associate Professor in the School of
Industrial and Systems Engineering and an Associate Director of the Machi
ne Learning @ GT Center at the Georgia Institute of Technology as well as
a Professor at the University of Erlangen-Nürnberg. Sebastian received th
e David M. McKenney Family Early Career Professorship in 2016\, an NSF CAR
EER Award in 2015\, the Coca-Cola Early Career Professorship in 2014\, the
outstanding thesis award of the University of Duisburg-Essen in 2006\, as
well as various Best Paper awards. \n\n\n\nPokutta’s research is situat
ed at the intersection of Artificial Intelligence and Optimization\, combi
ning Machine Learning with Discrete Optimization techniques as well as the
Theory of Extended Formulations\, exploring the limits of computation in
alternative models of complexity. A particular focus are so-called Frank-W
olfe methods and conditional gradient methods due to their versatility in
the context of constrained optimization and structured learning. Pokutta h
as also worked on applications of Optimization and Machine Learning\, leve
raging data in the context of pressing industrial and financial challenges
. These areas include Supply Chain Management\, Manufacturing\, Cyber-Phys
ical Systems (incl. Industrial Internet\, Industry 4.0\, Internet of Thing
s)\, and Finance. Examples of Pokutta’s applied work include stowage opt
imization problems for inland vessels\, oil production problems\, clearing
of electricity markets\, order fulfillment problems\, warehouse location
problems\, simulation of autonomous vehicle fleets\, portfolio optimizatio
n problems\, optimal liquidity management strategies\, and predictive preg
nancy diagnostics.\n
CATEGORIES:Seminar Series,Statistics & Data Science Seminar
END:VEVENT
BEGIN:VEVENT
UID:401@fds.yale.edu
DTSTART;TZID=America/New_York:20230404T120000
DTEND;TZID=America/New_York:20230404T130000
DTSTAMP:20240226T190403Z
URL:https://fds.yale.edu/events/critical-visualizations-rethinking-represe
ntations-of-data/
SUMMARY:Critical Visualizations: Rethinking representations of data
DESCRIPTION:Speaker: Peter A. HallReader in Graphic Design at CCW\, Univers
ity of the Arts London\, UK\n\n\n\nLocation: 17 Hillhouse Avenue\, 3rd flo
or\n\n\n\nAbstract: Information may be beautiful\, but our decisions about
the data we choose to represent and how we represent it are never neutral
. This insightful history traces how data visualization accompanied modern
technologies of war\, colonialism and the management of social issues of
poverty\, health and crime. Discussion is based around examples of visuali
zation\, from the ancient Andean information technology of the quipu to
contemporary projects that show the fate of our rubbish and take a partici
patory approach to visualizing cities. This analysis places visualization
in its theoretical and cultural contexts\, and provides a critical framewo
rk for understanding the history of information design with new directions
for contemporary practice.\n\n\n\nSpeaker bio: Peter A. Hall is Reader in
Graphic Design at CCW\, University of the Arts London\, UK. His publicati
ons include Critical Visualization: Rethinking the Representation of Data\
, co-authored with Patricio Dávila (Bloomsbury\, 2022)\, Sagmeister: Made
You Look (2009)\, Else/Where: Mapping - New Cartographies of Networks and
Territories\, co-edited with Janet Abrams (2005) and Tibor Kalman: Perver
se Optimist (2002).\n\n\n\nFor more information about the book please visi
t here\n
CATEGORIES:Seminar Series,Special Seminar
END:VEVENT
BEGIN:VEVENT
UID:409@fds.yale.edu
DTSTART;TZID=America/New_York:20230405T160000
DTEND;TZID=America/New_York:20230405T170000
DTSTAMP:20240227T154008Z
URL:https://fds.yale.edu/?post_type=event&p=2068
SUMMARY:FDS & Econometrics Talk: Markus Pelger (Stanford) "Stripp
ing the Discount Curve – a Robust Machine Learning Approach"
DESCRIPTION:Speaker: Markus Pelger\, Stanford\n\n\n\nIn-person seminars wi
ll be held at Mason Lab 211 with optional remote access: https://yale.host
ed.panopto.com/Panopto/Pages/Sessions/List.aspx?folderID=f8b73c34-a27b-42a
7-a073-af2d00f90ffa\n\n\n\nJoint talk hosted with Xiaohong Chen and Edwar
d Vytlacil from the Department of Economics\n\n\n\nAbstract: The yield cur
ve of U.S. Treasu
ry securities is one of the most fundamental economic quantities and criti
cal datasets for researchers and practitioners. The yield curve or\, equiv
alently\, discount curve is a key factor for economists\, traders\, asset
managers\, central banks\, and financial-markets regulators. Precise and r
obust yield estimates are needed for trading and making investment decisio
ns\, studying the term structure\, predicting bond returns\, analyzing mon
etary policy\, and pricing assets\, derivatives and liabilities. We introd
uce a robust\, flexible and easy-to-implement method for estimating the yi
eld curve from the sparse set of noisy Treasury securities. Our non-parame
tric estimator can explain complex yield curve shapes. We trade off pricin
g errors against an economically motivated smoothness reward of the discou
nt curve. This uniquely determines the optimal basis functions that span t
he discount curve in a reproducing kernel Hilbert space. We show that most
existing models for estimating the discount curve are nested within our g
eneral framework by imposing additional ad-hoc assumptions. We provide a c
losed-form solution of our machine learning estimator as a simple kernel r
idge regression\, which is straightforward to implement. We show\, in an e
xtensive empirical study on U.S. Treasury securities\, that our method stron
gly dominates all parametric and non-parametric benchmarks. It achieves su
bstantially smaller out-of-sample yield and pricing errors\, while being r
obust to outliers and data selection choices. We attribute the superior pe
rformance to the optimal trade-off between flexibility and smoothness\, wh
ich positions our method as the new standard for yield curve estimation. W
e provide a publicly available and regularly updated new benchmark dataset
for daily zero-coupon Treasury yields based on our estimates. Our benchma
rk dataset provides the most precise zero-coupon Treasury yield estimates
 for all maturity ranges\, while being robust to data selection choices.\n\n
\n\nBio: Markus Pelger is an Assistant Professor of Management Science & En
gineering at Stanford University and a Reid and Polly Anderson Faculty Fel
low. His research focuses on understanding and managing financial risk. He d
evelops mathematical financial models and statistical methods\, analyzes f
inancial data and engineers computational techniques. His research is divi
ded into three streams: statistical learning in high-dimensional financial
data sets\, stochastic financial modeling\, and high-frequency statistics
. His most recent work focuses on developing machine learning solutions to
 big-data problems in empirical asset pricing.\n\nMarkus' work has appeared in
the Journal of Finance\, Review of Financial Studies\, Management Science
\, Journal of Econometrics and Journal of Applied Probability. He is an As
sociate Editor of Management Science\, Digital Finance and Data Science in
Science. His research has been recognized with several awards\, including
the Utah Winter Finance Conference Best Paper Award\, the Best Paper in A
sset Pricing Award at the SFS Cavalcade\, the Dennis Aigner Award of the J
ournal of Econometrics\, the International Center for Pension Management R
esearch Award\, the CAFM Best Paper Award and the IQAM Research Award. He
has been invited to speak at hundreds of world-renowned universities\, con
ferences and investment and technology firms.\n\nMarkus received his Ph.D. in
Economics from the University of California\, Berkeley. He has two Diploma
s in Mathematics and in Economics\, both with highest distinction\, from t
he University of Bonn in Germany. He is a scholar of the German National M
erit Foundation and he was awarded a Fulbright Scholarship\, the Institute
for New Economic Thinking Prize\, the Eliot J. Swan Prize and the Graduat
e Teaching Award at Stanford University. Markus is a founding organizer o
f the AI & Big Data in Finance Research Forum and the Advanced Financia
l Technology Laboratories.\n\n\n\nWebsite: https://mpelger.people.stanfo
rd.edu/\n\n\n\nWednesday\, April 5\, 2023\n\n\n\n3:30pm – Pre-talk mee
t and greet teatime – Dana House\, 24 Hillhouse Avenue\n\n\n\n4:00p
m – 5:00pm – Talk – Mason Lab 211\, 9 Hillhouse Avenue with the op
tion of virtual participation\n
CATEGORIES:Seminar Series
END:VEVENT
BEGIN:VEVENT
UID:392@fds.yale.edu
DTSTART;TZID=America/New_York:20230417T160000
DTEND;TZID=America/New_York:20230417T170000
DTSTAMP:20240226T190401Z
URL:https://fds.yale.edu/events/fds-colloquium-dan-yamins-a-fruitful-recip
rocity-the-neuroscience-ai-connection/
SUMMARY:FDS Colloquium: Dan Yamins "A Fruitful Reciprocity: The Neuroscienc
e–AI Connection"
DESCRIPTION:Speaker: Dan YaminsAssistant Professor of Psychology and Comput
er ScienceStanford University\n\n\n\nHosted by: John Lafferty\n\n\n\nIn-pe
rson event with remote access option via Panopto\n\n\n\nA Fruitful Recipro
city: The Neuroscience-AI Connection\n\n\n\nAbstract: The emerging field o
f NeuroAI has leveraged techniques from artificial intelligence to analyze
large-scale brain data. In this talk\, I will show that the connection be
tween neuroscience and AI can be fruitful in both directions. Towards “A
I driving neuroscience”\, I will discuss recent advances in self-supervi
sed learning with deep recurrent networks that yield a developmentally-pla
usible model of the primate visual system. In the direction of “neurosci
ence guiding AI”\, I will present a novel cognitively-grounded computati
onal theory of perception that generates powerful new learning algorithms
for real-world scene understanding. Taken together\, these ideas illustrat
e how neural networks optimized to solve cognitively-informed tasks provid
e a unified framework for both understanding the brain and improving AI.\n
\n\n\nBio: Dan Yamins is a computational neuroscientist at Stanford Univer
sity\, where he's an assistant professor of Psychology and Computer Scienc
e\, and a faculty scholar at the Wu Tsai Neurosciences Institute. Dan work
s on science and technology challenges at the intersection of neuroscience
\, artificial intelligence\, psychology and large-scale data analysis.\n\n
\n\nThe brain is the embodiment of the most beautiful algorithms ever writ
ten. His research group\, the Stanford NeuroAILab\, seeks to "reverse engi
neer" these algorithms\, both to learn about how our minds work and to bu
ild more effective artificial intelligence systems. Website: http://stanf
ord.edu/~yamins/\n\n\n\nMonday\, April 17\, 2023 \n\n\n\n3:30pm – Pre-ta
lk meet and greet teatime – Dana House\, 24 Hillhouse Avenue\n\n\n\n4:00
- 5:00 pm - Talk - In-Person seminars will be held at Mason Lab 211 with
virtual participation (on campus only):(https://yale.hosted.panopto.com/Pa
nopto/Pages/Sessions/List.aspx?folderID=f8b73c34-a27b-42a7-a073-af2d00f90f
fa)\n
CATEGORIES:Colloquium,Seminar Series
LOCATION:Mason Lab 211 with remote access option
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:393@fds.yale.edu
DTSTART;TZID=America/New_York:20230420T160000
DTEND;TZID=America/New_York:20230420T170000
DTSTAMP:20240226T190401Z
URL:https://fds.yale.edu/events/fds-colloquium-philippe-rigollet-mit-stati
stical-applications-of-wasserstein-gradient-flows/
SUMMARY:FDS Colloquium: Philippe Rigollet (MIT) "Statistical applications o
f Wasserstein gradient flows"
DESCRIPTION:Speaker: Philippe Rigollet\, PhDProfessor of MathematicsMassac
husetts Institute of Technology\n\n\n\nHosted by Yihong Wu\n\n\n\nIn-pers
on event with remote access option via Panopto\n\n\n\nStatistical applica
tions of Wasserstein gradient flows\n\n\n\nAbstract: Otto calculus is a fu
ndamental toolbox in mathematical optimal transport\, imparting the Wasser
stein space of probability measures with a Riemannian structure. In partic
ular\, one can compute the Riemannian gradient of a functional over this s
pace and\, in turn\, optimize it using Wasserstein gradient flows. The nec
essary background to define and compute Wasserstein gradient flows will be
presented in the first part of the talk before moving to several statisti
cal applications ranging from variational inference to maximum likelihood
estimation in Gaussian mixture models. Emphasis will be placed on conceptu
al ideas in order for the talk to be accessible to a broad audience.\n\n\n
\nBio: Philippe Rigollet works at the intersection of statistics\, machine
learning\, and optimization\, focusing primarily on the design and analys
is of statistical methods for high-dimensional problems. His recent resear
ch focuses on statistical optimal transport and its applications to geomet
ric data analysis and sampling. Website: www-math.mit.edu/~rigollet\n\n\n\
nThursday\, April 20\, 2023\n\n\n\n3:30pm – Pre-talk meet and greet teat
ime – Dana House\, 24 Hillhouse Avenue\n\n\n\n4:00 – 5:00pm – Talk
– This in-person seminar will be held at 17 Hillhouse\, 3rd Floor Common
Area with virtual participation https://yale.hosted.panopto.com/Panopto/P
ages/Viewer.aspx?id=7219ac1f-3d1b-458c-86d7-afe9010e4e65\n
CATEGORIES:Colloquium,Seminar Series
LOCATION:17 Hillhouse Ave\, 3rd floor
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=17 Hillhouse Ave\, 3rd floor:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:408@fds.yale.edu
DTSTART;TZID=America/New_York:20230424T160000
DTEND;TZID=America/New_York:20230424T170000
DTSTAMP:20240226T190405Z
URL:https://fds.yale.edu/events/fds-colloquium-robert-schapire-microsoft-r
esearch-convex-analysis-at-infinity-an-introduction-to-astral-space/
SUMMARY:FDS Colloquium: Robert Schapire (Microsoft Research) "Convex Analys
is at Infinity: An Introduction to Astral Space"
DESCRIPTION:Speaker: Robert SchapireComputer Scientist\,Microsoft Research
(NYC Lab)\n\n\n\nHosted by: Dan Spielman\n\n\n\nIn person event with remot
e access via Panopto.\n\n\n\nAbstract: Not all convex functions have finit
e minimizers\; some can only be minimized by a sequence as it heads to inf
inity. In this work\, we aim to develop a theory for understanding such
minimizers at infinity. We study astral space\, a compact extension of
Euclidean space to which such points at infinity have been added. Astral
space is constructed to be as small as possible while still ensuring that
all linear functions can be continuously extended to the new space. Alt
hough not a vector space\, nor even a metric space\, astral space is never
theless so well-structured as to allow useful and meaningful extensions of
such concepts as convexity\, conjugacy\, and subdifferentials. We devel
op these concepts and analyze various properties of convex functions on as
tral space\, including the detailed structure of their minimizers\, exact
characterizations of continuity\, and convergence of descent algorithms.\n
\n\n\nThis is joint work with Miroslav Dudík and Matus Telgarsky.\n\n\n\n
Bio: Robert Schapire is a Partner Researcher at Microsoft Research in New
York City. He received his PhD from MIT in 1991. After a short postdoc at
Harvard\, he joined the technical staff at AT&T Labs (formerly AT&T Bell L
aboratories) in 1991. In 2002\, he became a Professor of Computer Science
at Princeton University. He joined Microsoft Research in 2014. His awards
include the 1991 ACM Doctoral Dissertation Award\, the 2003 Gödel Prize\,
 and the 2004 Kanellakis Theory and Practice Award (both of the last two w
ith Yoav Freund). He is a fellow of the AAAI\, and a member of both the Na
tional Academy of Engineering and the National Academy of Sciences. His ma
in research interest is in theoretical and applied machine learning. Websi
te: http://rob.schapire.net/\n\n\n\nMonday\, April 24\, 2023\n\n\n\n3:30pm
– Pre-talk meet and greet teatime – Dana House\, 24 Hillhouse Avenue\
n\n\n\n4:00pm – 5:00 pm – Talk – Mason Lab 211 with the option of vi
rtual participation https://yale.hosted.panopto.com/Panopto/Pages/Viewer.a
spx?id=e1a54b37-2829-4b7a-841e-af93011fd666\n
CATEGORIES:Colloquium,Seminar Series
LOCATION:Mason Lab 211 with remote access option
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:391@fds.yale.edu
DTSTART;TZID=America/New_York:20230427T160000
DTEND;TZID=America/New_York:20230427T170000
DTSTAMP:20240226T190401Z
URL:https://fds.yale.edu/events/fds-seminar-abhinav-bhardwaj-yale-math-ent
ry-wise-dissipation-for-singular-vector-perturbation-bounds/
SUMMARY:FDS Seminar: Abhinav Bhardwaj (Yale Math)\, "Entry–wise dissipati
on for singular vector perturbation bounds"
DESCRIPTION:Speaker: Abhinav Bhardwaj (Yale Math)\n\n\n\nAbstract: Consider
a random perturbation of a low rank matrix. In this talk\, we discuss ent
ry-wise bounds on the perturbation of the singular vectors (i.e.\, a Davis-
Kahan type bound in the infinity norm). Among others\, our result shows th
at\, under common incoherence assumptions\, the entry-wise error is evenly
dissipated. This improves a number of previous results and has algorithmi
c applications for many well known clustering problems\, including the hid
den clique\, planted coloring\, and planted bipartition.\n\n\n\nLocation:
24 Hillhouse\, room 107\n
CATEGORIES:Seminar Series
END:VEVENT
BEGIN:VEVENT
UID:410@fds.yale.edu
DTSTART;TZID=America/New_York:20230501T080000
DTEND;TZID=America/New_York:20230501T150000
DTSTAMP:20240227T154008Z
URL:https://fds.yale.edu/?post_type=event&p=2069
SUMMARY:Workshop: Healthcare Data Science
DESCRIPTION:Healthcare Data Science Workshop\, organizers and speakers\n\n
\n\nThis workshop\, hosted by the Yale Institute for Foundations of Data S
cience\,
aims to channel the growing interest in healthcare among data scientists a
nd help bridge the information gap for data scientists and clinical invest
igators\, who have complementary expertise that can enhance scientific dis
covery and patient care.Our target audience spans Yale University and Yale
-New Haven Hospital\, representing a multidisciplinary group of clinicians
\, research scientists\, as well as students across Yale. Our plan is to d
efine the broad value of data science in healthcare\, identify the various
data streams within healthcare settings\, highlight the barriers to succe
ssful scientific investigations in this area\, and discuss ways in which c
linicians and data scientists can collaborate. We are planning to achieve
these goals through short talks and panel discussions in the morning\, fol
lowed by hands-on workshops highlighting some of the key takeaways for par
ticipants.\n\n\n\nOrganized by Rohan Khera\, Assistant Professor of Medic
ine (Cardiovascular Medicine) and of Biostatistics (Health Informatics)\; Cl
inical Director\, Center for Health Informatics and Analytics\, YNHH/Yale Ce
nter for Outcomes Research & Evaluation (CORE)\; Director\, Cardiovascular D
ata Science (CarDS) Lab\, Yale University\n\nClick here to register.\n\nSche
dule:\n\n8:00-8:05 - Rohan Khera\, MD\, MS: Introduction\n\n8:05-9:00 - Luci
la Ohno-Machado\, MD\, PhD: “Keynote Presentation: The unique challenges a
nd opportunities in healthcare data science”\nSession notes: The session w
ill define why healthcare increasingly relies on data science and how ther
e is an incredible opportunity to improve the health of people and societi
es by appropriately leveraging the various data streams. It will also iden
tify the uniqueness of healthcare data science and the need for building s
pecific expertise to translate scientific discoveries to healthcare.\n\n9:
00-9:15 - Coffee Break\n\n9:15-10:15 - Marc Suchard\, MD\, PhD: "Data scie
nce across silos in healthcare"\nSession notes: This solution-oriented tal
k by Dr. Suchard will demonstrate the power of collaborative science for d
ata-driven discoveries while successfully tackling data silos in healthcar
e. The work will highlight successful federated approaches that have prove
n to be a highway to successful multicenter and multinational studies.\n\n
10:15-10:45 - Smita Krishnaswamy\, PhD: “Strategies to Tackle Multimoda
l Sparse Data in the Electronic Health Record”\nSession notes: Dr. Krish
naswamy will provide methodological approaches to overcoming the challenge
s of working with real-world healthcare data that spans multiple domains\, o
ften with informative missingness. The overview will show how innovation i
n methods can solve key challenges in healthcare data science.\n\n10:45-11
:00 - Break\n\n11:00-12:30 - Harlan Krumholz\, MD\, SM\; Puneet Batra\, Ph
D\; & Panelists: “How to build successful clinician-investigator and dat
a scientist collaborations”\nSession notes: This interactive panel discu
ssion will tackle how best to bridge the gap between clinician investigato
rs and data scientists to enable successful discovery and have a major imp
act. The panel will discuss the morning's talks\, and the panel chairs wil
l share their unique experiences building these bridges as a clinician (Dr
. Krumholz) and a data scientist (Dr. Batra)\, with participation from th
e panel and audience.\n\n12:30-13:30 - Lunch Break\n\n13:30-15:00 - "Hand
s-on EHR Workshop by the CarDS Lab"\nSession notes: Members of the Cardio
vascular Data Science (CarDS) Lab at Yale School of Medicine will lead a h
ands-on demonstration of working with structured data in the electronic he
alth record\, the most widely available data stream across health systems. T
he EHR workshop will focus on applying research best practices when workin
g with the EHR. The workshop will be interactive and will follow an instru
ctor-led session format with the following learning objectives:\n- key iss
ues when working with structured data from the EHR\n- how to work with com
mon data models to design scalable studies\n- how to apply statistical bes
t practices to data from the EHR\n\nThe demonstration will include real-wo
rld de-identified EHR data and a web-based format that allows participant
s to learn key analytic principles and design and test a mini-study.\n\n
\n\nSpeakers:\n\n\n\nLucila Ohno-Machado\, MD\, MBA\, PhD\nWalde
mar von Zed
twitz Professor of Medicine and Biomedical Informatics and Data Science\;
Deputy Dean for Biomedical Informatics\; Chair\, Section of Biomedical Inf
ormatics and Data Science\n\nLucila Ohno-Machado is Deputy Dean for Biomed
ical Informatics at Yale and Chair of Biomedical Informatics and Data Scie
nce. She is an MD\, PhD\, and MBA\, and has received numerous awards for h
er le
adership in informatics and is an elected member of the National Academy o
f Medicine\, the American Society for Clinical Investigation\, and the Ame
rican College of Medical Informatics. Her research focuses on predictive m
odels\, data sharing\, and innovative algorithms to distribute computation
 with local data.\n\n\n\nMarc Suchard\, MD\, PhD\nProfessor of Biostatisti
cs\, Biomathematics\, & Human Genetics\nUCLA\n\nMarc Suchard is a profess
or in the Departments of Biostatistics\, Biomathematics and Human Genetic
s at UCLA. He
has a Medical Degree from UCLA and a PhD in Biomathematics from the same
university. Dr. Suchard is helping to develop the nascent field of evoluti
onary medicine. This field harnesses the power of methods and theory from
evolutionary biology to advance our understanding of human disease process
es. Just as phylogenetic approaches have stimulated the field of evolution
 at large\, they possess the potential to revolutionize evolutionary medici
ne\, particularly in the study of rapidly evolving pathogens. To bridge th
e gap between phylogenetics and human-pathogen biology\, Dr. Suchard's int
erests focus on the development of novel reconstruction methods drawing he
avily on statistical\, mathematical and computation techniques. Some of hi
s current projects involve jointly estimating alignments and phylogenies f
rom molecular sequence data and mapping recombination hot-spots in the HIV
 genome.\n\n\n\nPuneet Batra\, PhD\nSenior Principal\, Flagship Pioneerin
g\n\nPuneet is
a senior principal at Flagship Pioneering where he works as part of a vent
ure-creation team leading machine learning strategy and helping advance Fl
agship portfolio companies working in generative drug design and materials
.Prior to joining Flagship Pioneering\, Puneet was the Director of Machine
 Learning at the Broad Institute of Harvard & MIT\, where he helped f
ound the Machine Learning for Health group that broke new ground in the de
velopment and application of deep learning architectures for biological di
scovery in cardiovascular disease\, metabolic disease\, and brain health.
Prior to the Broad\, Puneet was lead scientist at Aster Data (acquired by T
eradata). He has published in Nature Genetics\, The New England Journal of Medi
cine\, The Lancet\, and Physical Review D\, and has served as Co-PI on gra
nts from the American Heart Association\, the Department of Energy\, the N
ational Institutes of Health\, National Heart\, Lung\, and Blood Institute
\, and the Impetus Foundation. He serves on the advisory board of Our Heal
th\, a non-profit initiative to research the root causes of atheroscleroti
c disease in South Asians.Puneet completed his B.A. at Harvard University
and has a Ph.D. from Stanford University\, both in theoretical physics.Har
lan Krumholz\, MD\, SMHarold H. Hines\, Jr. Professor of Medicine (Cardiol
ogy) and Professor in the Institute for Social and Policy Studies\, of Inv
estigative Medicine and of Public Health (Health Policy)\; Director\, Cent
er for Outcomes Research and Evaluation (CORE)\; Yale UniversityHarlan Kru
mholz is a cardiologist and scientist at Yale University and Yale New Have
n Hospital\, who has been honored by membership in the National Academy of
Medicine\, the Association of American Physicians\, and the American Soci
ety for Clinical Investigation for his work to improve the quality and eff
iciency of care and eliminate disparities\, as well as co-founding the Yal
e University Open Data Access (YODA) Project\, medRxiv\, HugoHealth\, Refa
ctor Health and the American Heart Association’s Quality of Care an
d Outcomes Research Council. He has published more than 1400 articles and
three books with an h-index of more than 220.Smita Krishnaswamy\, PhDAssoc
iate Professor of Genetics and of Computer ScienceSmita Krishnaswamy is an
Associate Professor in the departments of Computer Science (SEAS) and Gen
etics (YSM). She is part of the programs in Applied Mathematics\, Computat
ional Biology & Bioinformatics and Interdisciplinary Neuroscience. Sh
e is also affiliated with the Yale Institute for Foundations of Data Scien
ce\, the Wu-Tsai Institute\, and the Yale Cancer Center. Smita's lab works on fund
amental deep learning and machine learning developments for representing a
nd learning from big data. Her techniques incorporate mathematical priors
from graph spectral theory\, manifold learning\, signal processing\, and t
opology into machine learning and deep learning frameworks\, in order to d
enoise and model the underlying systems faithfully for predictive insight.
Currently her methods are being widely used for data denoising\, visualiz
ation\, generative modeling\, dynamics modeling\, comparative analysis an
d domain transfer in datasets arising from stem cell biology\, cancer\, im
munology and structural biology (among others).Smita teaches several cours
es including: Deep Learning Theory and Applications\, Unsupervised Learnin
g\, and Geometric and Topological Methods in Machine Learning. Prior to jo
ining Yale\, Smita completed her postdoctoral training at Columbia Univers
ity in the systems biology department where she focused on learning comput
ational models of cellular signaling from single-cell mass cytometry data.
 She obtained her Ph.D. from the EECS department at the University of Michigan whe
re her research focused on algorithms for automated synthesis and probabil
istic verification of nanoscale logic circuits. Following her time in Mich
igan\, Smita spent 2 years at IBM's TJ Watson Research Center as a researc
her in the systems division where she worked on automated bug finding and
error correction in logic. Smita's work over the years has won several awa
rds including the NSF CAREER Award\, Sloan Faculty Fellowship\, and Blavat
nik Fund for Innovation.\n
CATEGORIES:Training,Workshops
END:VEVENT
BEGIN:VEVENT
UID:394@fds.yale.edu
DTSTART;TZID=America/New_York:20230503T160000
DTEND;TZID=America/New_York:20230503T170000
DTSTAMP:20240226T190401Z
URL:https://fds.yale.edu/events/fds-seminar-wei-ji-ma-process-models-of-co
mplex-mental-computation/
SUMMARY:FDS Seminar: Wei Ji Ma\, "Process models of complex mental computat
ion"
DESCRIPTION:"Process models of complex mental computation"\n\n\n\nSpeaker:
Wei Ji MaProfessor of Neural Science and PsychologyCenter for Neural Scien
ceNew York University\n\n\n\nLocation: 211 Mason or remotely via Panopto:
https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=77784297-c06b
-441e-b2d8-af93011fd6e8\n\n\n\nAbstract: Computational cognitive models co
mmit to a sequence of steps in which an observer/agent mentally processes
information leading up to a behavioral response. Typically\, both the mode
l parameters and the model structure have to be inferred solely from stimu
lus-response pairs. For more complex mental computations\, these inference
s tend to be more challenging\, yet potentially yield greater insights. I
will illustrate this using two examples from disparate domains. In the fir
st study\, we test whether people perform unconscious Bayesian inference i
n visual search\, specifically\, whether they marginalize over nuisance va
riables. In the second study\, we model human planning in a two-player boa
rd game using a “humanized” variant of best-first search. I will descr
ibe the methodological challenges associated with unbiased estimation of l
og likelihoods and with parameter fitting\, and our proposed solutions. \
n\n\n\nSpeaker bio: Wei Ji Ma is Professor of Neural Science and Psycholog
y at NYU. His lab studies decision-making in planning\, social cognition\,
working memory\, perception\, and attention\, using a combination of huma
n behavioral experiments\, computational modeling\, and - through collabor
ations - electrophysiology and neuroimaging. Wei Ji grew up in the Netherl
ands and received his Ph.D. in Physics from the University of Groningen. H
e continued as a postdoc in computational neuroscience\, first with Christ
of Koch at Caltech and then with Alexandre Pouget at the University of Roc
hester. He was Assistant Professor of Neuroscience at Baylor College of Me
dicine from 2008 to 2013. He has been at NYU since 2013. He has affiliate
appointments in the Neuroscience Institute\, the Institute for the Study o
f Decision Making\, the Center for Data Science\, and the Center for Exper
imental Social Science\, and is Collaborating Faculty of the NYU-ECNU Inst
itute of Brain and Cognitive Science at NYU Shanghai. With Xiao-Jing Wang\
, Wei Ji is Program Director of the NIH-funded Training Program in Computa
tional Neuroscience at NYU. Moreover\, Wei Ji is active in mentorship\, co
mmunity-building\, and outreach. He is a founding member of the Scientist
Action and Advocacy Network and of NeuWrite NYU. Wei Ji co-founded and lea
ds the Growing up in Science seminar series\, in which scientists tell the
ir "unofficial stories". Besides his academic work\, Wei Ji is the co-fou
nder of the Rural China Ed
ucation Foundation.\n\n\n\nHosted by John Lafferty\n
CATEGORIES:Seminar Series
LOCATION:Mason Lab 211 with remote access option\, \,
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=\, ;X-APPLE-RADIUS=100;X-TI
TLE=Mason Lab 211 with remote access option:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:390@fds.yale.edu
DTSTART;TZID=America/New_York:20230524T120000
DTEND;TZID=America/New_York:20230524T130000
DTSTAMP:20240226T190400Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-xifan-yu-algor
ithmic-lower-bounds-for-expansion-profile-of-regular-graphs/
SUMMARY:Yale Theory Student Seminar: Xifan Yu\, "Algorithmic Lower Bounds f
or Expansion Profile of Regular Graphs"
DESCRIPTION:"Algorithmic Lower Bounds for Expansion Profile of Regular Grap
hs"\n\n\n\nSpeaker: Xifan Yu\n\n\n\nThis is a place for theory-minded stud
ents and postdocs to gather for a weekly lunch seminar. We meet on Wednesd
ays at 12 (for lunch) and the talk starts at 12:15. The presentations are
on papers\, results\, conjectures\, or anything theory-oriented. In order
to keep things more casual and interactive\, presentations are on the boar
d. We meet at the YINS common room\, on the 3rd floor of 17 Hillhouse.\n\n
\n\nhttps://yaletheorystudents.github.io/\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
END:VEVENT
BEGIN:VEVENT
UID:388@fds.yale.edu
DTSTART;TZID=America/New_York:20230531T120000
DTEND;TZID=America/New_York:20230531T131500
DTSTAMP:20240226T190400Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-asaf-etgar-on-
the-connectivity-and-diameter-of-geodetic-graphs/
SUMMARY:Yale Theory Student Seminar: Asaf Etgar\, "On the Connectivity and
Diameter of Geodetic Graphs"
DESCRIPTION:Abstract: Geodetic Graphs are graphs in which any two vertices
are connected by a unique shortest path. In 1962\, Ore asked to characteri
ze this fundamental family of graphs. Despite many attempts\, such charact
erization seems beyond reach. In this talk we present some history of geod
etic graphs\, some constructions - and a result that\, under reasonable as
sumptions\, limits the structure of geodetic graphs - taking another step
towards characterization.\n\n\n\nhttps://yaletheorystudents.github.io/\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
END:VEVENT
BEGIN:VEVENT
UID:375@fds.yale.edu
DTSTART;TZID=America/New_York:20230607T120000
DTEND;TZID=America/New_York:20230607T130000
DTSTAMP:20240226T190355Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-series-jane-le
e-statistics-without-iid-samples-learning-from-truncated-data/
SUMMARY:Yale Theory Student Seminar Series: Jane Lee\, "Statistics Without
iid Samples: Learning From Truncated Data"
DESCRIPTION:Website: https://yaletheorystudents.github.io/\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:325@fds.yale.edu
DTSTART;TZID=America/New_York:20230614T120000
DTEND;TZID=America/New_York:20230614T130000
DTSTAMP:20240226T190341Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-jinzhao-wu-on-
the-optimal-fixed-price-mechanism-in-bilateral-trade/
SUMMARY:Yale Theory Student Seminar: Jinzhao Wu\, "On the Optimal Fixed–P
rice Mechanism in Bilateral Trade"
DESCRIPTION:If you are interested in joining the mailing list\, please reac
h out to Marco Pirazzini (marco.pirazzini@yale.edu) or Siddharth Mitra (si
ddharth.mitra@yale.edu).\n\n\n\nYale Theory Student Seminar Website\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:324@fds.yale.edu
DTSTART;TZID=America/New_York:20230705T120000
DTEND;TZID=America/New_York:20230705T130000
DTSTAMP:20240226T190341Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-siddharth-mitr
a-on-single-cell-trajectory-inference/
SUMMARY:Yale Theory Student Seminar: Siddharth Mitra\, "On Single–cell Tr
ajectory Inference"
DESCRIPTION:If you are interested in joining the mailing list\, please reac
h out to Marco Pirazzini (marco.pirazzini@yale.edu) or Siddharth Mitra (si
ddharth.mitra@yale.edu).\n\n\n\nYale Theory Student Seminar Website\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:386@fds.yale.edu
DTSTART;TZID=America/New_York:20230712T120000
DTEND;TZID=America/New_York:20230712T130000
DTSTAMP:20240226T190359Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-asaf-etgar-on-
graphs-and-geometry/
SUMMARY:Yale Theory Student Seminar: Asaf Etgar\, "On Graphs and Geometry"
DESCRIPTION:https://yaletheorystudents.github.io/\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
END:VEVENT
BEGIN:VEVENT
UID:339@fds.yale.edu
DTSTART;TZID=America/New_York:20230719T120000
DTEND;TZID=America/New_York:20230719T130000
DTSTAMP:20240226T190345Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-marco-pirazzin
i-on-the-small-set-expansion-hypothesis/
SUMMARY:Yale Theory Student Seminar: Marco Pirazzini\, "On the Small Set Ex
pansion Hypothesis"
DESCRIPTION:If you are interested in joining the mailing list\, please reac
h out to Marco Pirazzini (marco.pirazzini@yale.edu) or Siddharth Mitra (si
ddharth.mitra@yale.edu).\n\n\n\nYale Theory Student Seminar Website\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:323@fds.yale.edu
DTSTART;TZID=America/New_York:20230726T120000
DTEND;TZID=America/New_York:20230726T130000
DTSTAMP:20240226T190340Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-khashayar-gatm
iry-mit-sampling-with-barriers-faster-mixing-via-lewis-weights/
SUMMARY:Yale Theory Student Seminar: Khashayar Gatmiry (MIT)\, "Sampling wi
th Barriers: Faster Mixing via Lewis Weights"
DESCRIPTION:If you are interested in joining the mailing list\, please reac
h out to Marco Pirazzini (marco.pirazzini@yale.edu) or Siddharth Mitra (si
ddharth.mitra@yale.edu).\n\n\n\nYale Theory Student Seminar Website\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:322@fds.yale.edu
DTSTART;TZID=America/New_York:20230816T120000
DTEND;TZID=America/New_York:20230816T130000
DTSTAMP:20240226T190340Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-john-lazarsfel
d-decentralized-learning-dynamics-in-the-gossip-model/
SUMMARY:Yale Theory Student Seminar: John Lazarsfeld\, "Decentralized Learn
ing Dynamics in the Gossip Model"
DESCRIPTION:If you are interested in joining the mailing list\, please reac
h out to Marco Pirazzini (marco.pirazzini@yale.edu) or Siddharth Mitra (si
ddharth.mitra@yale.edu).\n\n\n\nYale Theory Student Seminar Website\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:321@fds.yale.edu
DTSTART;TZID=America/New_York:20230823T120000
DTEND;TZID=America/New_York:20230823T130000
DTSTAMP:20240226T190340Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-aditi-laddha-d
eterminant-maximization-via-local-search/
SUMMARY:Yale Theory Student Seminar: Aditi Laddha\, "Determinant Maximizati
on via Local Search"
DESCRIPTION:If you are interested in joining the mailing list\, please reac
h out to Marco Pirazzini (marco.pirazzini@yale.edu) or Siddharth Mitra (si
ddharth.mitra@yale.edu).\n\n\n\nYale Theory Student Seminar Website\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:385@fds.yale.edu
DTSTART;TZID=America/New_York:20230829T140000
DTEND;TZID=America/New_York:20230829T150000
DTSTAMP:20240226T190359Z
URL:https://fds.yale.edu/events/data-science-project-match-2/
SUMMARY:Data Science Project Match
DESCRIPTION:An opportunity for students to match with data science research
 projects presented by Yale faculty.\n\n\n\nOpening Remarks & Introdu
ction \n\n\n\nby Daniel SpielmanSterling Professor of Computer Science\; P
rofessor of Statistics & Data Science\, and of MathematicsJames A. Attwood
Director of the Institute for Foundations of Data Science at Yale (FDS)\n
\n\n\nProject Presentations\n\n\n\nRohan Khera\, MD\, MSDirector\, Cardiov
ascular Data Science (CarDS) LabAssistant Professor\, Cardiovascular Medic
ine\, Yale School of Medicinerohan.khera@yale.edu | CarDS-Lab.org\n\n\n\n"
Innovating Cardiovascular Care with Multimodality Data Science"The Cardiov
ascular Data Science (CarDS) Lab at Yale leverages advances in deep learni
ng and AI to enhance and automate care. The work uses numerous data stream
s in the electronic health record and focuses on natural language processi
ng\, federated learning\, signal processing\, and computer vision for enha
nced inference\, and develops and deploys novel convolutional neural netwo
rks and transformer models to address care challenges. The experience is i
deal for students interested in health tech and/or medicine and looking to
gain from a longitudinal research experience.\n\n\n\nJennifer MarlonSenio
r Research Scientist\, School of the EnvironmentDirector of Data Science\,
Yale Program on Climate Change CommunicationLecturer\, Department of Mole
cular\, Cellular and Developmental Biologyjennifer.marlon@yale.edu | https
://environment.yale.edu/profile/jennifer-marlon\n\n\n\n“Using paleofire
records and global fire simulations to understand wildfire responses to cl
imate change and human activities”Jennifer Marlon\, Nicholas O'Mara\, Ca
rla StaverOver the last several years unusually large and severe wildfires
have devastated communities and wildlife and transformed ecosystems aroun
d the globe. This project reconstructs and analyzes long-term fire and veg
etation records from ice and lake sediment cores for comparison with dynam
ic global fire model simulations. We seek a data analyst/database engineer
to help develop the paleofire records and the SQL database that will hous
e them. The research assistant (RA) will use R and SQL to generate composi
te records of regional to global wildfire activity spanning thousands of y
ears of Earth’s history. The RA will have the opportunity to participate
in bi-weekly project meetings\, to present scientific results to a team o
f international\, interdisciplinary collaborators\, and to co-author peer-
reviewed publications.\n\n\n\nIlias ZadikAssistant Professor\, Department
of Statistics and Data Science Ilias.zadik@yale.edu | https://iliaszadik.
github.io/\n\n\n\n"MCMC methods for pooled testing"In pooled or group test
ing\, which was of high importance during the recent COVID-19 pandemic\, o
ne tests subsets of a population of individuals with the goal of detectin
g the subset of infected individuals using as few tests as possible. One o
f the simplest\, yet information-theoretically optimal (in terms of the to
tal number of tests used)\, testing procedures is to choose the individual
s participating in each test independently at random. This is a simple imp
lication of the so-called probabilistic method. Yet\, despite the simplici
ty of this procedure\, multiple natural computationally efficient procedur
es have been mathematically proven to require a larger number of tests. In
terestingly\, MCMC methods have never been mathematically analyzed for thi
s setting\, yet they have shown intriguing success in (small-scale) simula
tions. This project\, as part of a general goal of building tools to analy
ze MCMC methods for statistical tasks\, aims to understand empirically at la
rge scale (and ideally establish mathematically) the performance of natura
l MCMC methods for this important group testing scheme.\n\n\n\nSoh
eil GhiliAssistant Professor of Marketing\, School of Managementsoheil.ghi
li@yale.edu | https://sites.google.com/view/soheil-ghili/\n\n\n\n“Traini
ng Large Language Models for Price Negotiation”Price negotiation in acad
emia is mostly examined within the field of economics and in environments
in which each party to the negotiation has a simple set of moves available
: accept/reject the offer made\, or counter-offer a price. In this study\,
we aim to take a step further and train models for negotiation in environm
ents in which each party’s moves entail generating a text that not only c
ontains an offer\, but also supports it with information and reasoning. A
n important aspect of our objectives in training LLMs for this task is tha
t they learn the game-theoretic aspects. To illustrate\, a seller LLM th
at has info indicating its product is of high value is expected to share t
hat info as part of its offer\, while a seller that knows its product has
lower quality is expected to remain silent about the quality aspect. In th
e initial stages of the project\, we will try to train LLMs for simpler ta
sks\; and we will build toward the ultimate goal of price negotiation over
time.\n\n\n\nAlfred P. Kaye\, MD PhDAssistant Professor\, Department of P
sychiatry\, Yale University School of Medicinealfred.kaye@yale.edu | https
://www.kayelab.com/\n\n\n\n"Neural representation of threat" In this proje
ct\, we have recorded from large numbers of neurons in the mouse prefronta
l cortex as a mouse navigates through the environment. These optical recor
dings of neurons can be used to infer the animal's level of threat percept
ion in virtual environments with differing levels of safety. The neural re
presentation can then be used to predict behavior\, while accounting for o
ther variables such as arousal\, locomotion\, and other task-related measu
res. Thus\, a student interested in working on this project can apply nonl
inear dimensionality reduction and ML approaches to understand how neurons
encode information about emotionally related variables in the world.\n\n\
n\nLu LuAssistant Professor of Statistics and Data ScienceLu.lu@yale.edu |
https://lu.seas.upenn.edu\n\n\n\n“Physics-informed neural operators for
fast prediction of multiscale systems”High-fidelity simulations like di
rect numerical simulation (DNS) of turbulence and molecular dynamics (MD)
of atomistic systems are computationally very expensive and data-intensive
. Furthermore\, for multiscale problems\, the microscale component is so e
xpensive that it has stalled progress in simulating time-dependent atomist
ic-continuum systems. These open issues\, in turn\, have delayed progress
in forecasting of real-time dynamics in critical applications such as auto
nomy\, extreme weather patterns\, and efficiently designing new functional
materials. Scientific machine learning (SciML) has the potential to total
ly reverse this rather inefficient paradigm and significantly accelerate s
cientific discovery with direct impact on technology in the next few decad
es. We propose to develop a new generation of neural operators\, universal
approximators for operators\, that can learn explicit and implicit operat
ors from data only. To this end\, we need to extend the predictability of
neural operators for unseen out-of-distribution inputs and to speed up the
training process via high performance and multi-GPU computing. We will en
dow neural operators with physics\, multifidelity data\, and equivariant p
rinciples (e.g.\, geometric equivariance and conservation laws) for contin
uum systems and with seamless coupling for hybrid continuum-molecular syst
ems\, where neural operators will replace the expensive molecular componen
t.\n\n\n\nSteven KleinsteinAnthony N Brady Professor of Pathology. Departm
ent of Pathology\, Yale School of Medicine. Department of Immunobiology.st
even.kleinstein@yale.eduProject presented by Gisela Gabernet\, Associate R
esearch Scientist at the Kleinstein Labgisela.gabernet@yale.edu | https://
medicine.yale.edu/lab/kleinstein/\n\n\n\n“Identifying convergent antibod
y responses across infections and auto-immune diseases”The development o
f antibodies that target and neutralize pathogens is an important facet of
the adaptive immune response to foreign pathogens. Antibodies are generat
ed through the recombination of Variable\, Diversity and Joining gene segm
ents at the DNA level\, with additional targeted mutations that generate a
 theoretical antibody diversity of 10^14 unique sequences. Despite this hig
h diversity\, a bias in the usage of these gene segments or even antibodie
s with overall high sequence similarity – termed convergent antibod
ies – have been observed across cohorts of patients after an immune chal
lenge such as vaccination\, infection or auto-immune diseases. Convergent
antibodies have been described to target conserved epitopes across mutagen
ic pathogens such as HIV and influenza\, showing a potential towards the d
evelopment of broadly protective vaccines. They have also been observed in
auto-immune diseases\, potentially serving as diagnostics and monitoring
markers. In our lab\, we have developed a high-throughput analysis pipelin
e that enables the efficient processing of antibody repertoires of individ
ual cohorts (https://nf-co.re/airrflow). This project will aim at benchmar
king and improving current convergent antibody detection methods as well a
s visualizations. One potential approach will involve modelling the antibo
dy sequences as a network of sequence similarity and identifying regions i
n the network shared across multiple subjects.\n\n\n\nHemant TagareProfess
or of Radiology and Biomedical Imaging and of Biomedical Engineeringhemant
.tagare@yale.edu | https://medicine.yale.edu/profile/hemant-tagare/\n\n\n\
n“Predict the progression of Parkinson’s Disease”Parkinson’s Disea
se (PD) is the fastest growing neurodegenerative disease in the world. PD
is also heterogeneous – different patients progress at different rates a
long different trajectories. Predicting the patient-specific progress of P
D is critical in treating the disease and in shortening the length of clin
ical trials for new PD therapies. Currently\, there are no reliable method
s to predict PD progression. The goal of this research is to use a large d
ataset of PD patients to predict PD progress from baseline data. The datas
et has images\, clinical scores\, wearables data\, lab reports\, and genet
ic information. The challenge is to use this heterogeneous data to create
an accurate prediction model. All methods (frequentist\, Bayesian\, deep l
earning) are welcome.\n\n\n\nDavid van Dijk\, Ph.D.Assistant Professor of
Medicine\, Yale School of MedicineAssistant Professor of Computer Scienced
avid.vandijk@yale.edu | vandijklab.org\n\n\n\n"Using Machine Learning to u
nderstand the language of biology"Recent advances in large language models
provide new opportunities for decoding biology. Single-cell omics data en
codes complex cellular behaviors and processes into high-dimensional molec
ular profiles. By treating these data as textual representations\, we can
apply and fine-tune neural language models to uncover the underlying gramm
atical rules governing biological systems. We have demonstrated that these
models can learn to translate between species\, matching cell types and g
ene expression programs between mice and humans in a completely unsupervis
ed fashion. This cross-species translation highlights how fundamental aspe
cts of biology form a universal language translatable across organisms. Mo
re broadly\, interpreting single cell data as “biological text” enable
s leveraging powerful natural language processing approaches to find patte
rns\, generate hypotheses\, and gain conceptual understanding of biology.\
n\n\n\nZhuoran YangAssistant Professor\, Department of Statistics & Data S
ciencezhuoran.yang@yale.edu | https://statistics.yale.edu/people/zhuoran-y
ang\n\n\n\n"What and How does In-Context Learning Learn? Bayesian Model Av
eraging\, Parameterization\, and Generalization"Large language models demo
nstrate an in-context learning (ICL) ability\, i.e.\, they can learn from
a few examples provided in the prompt without updating their parameters. I
n this project\, we conduct a comprehensive study of ICL\, addressing seve
ral open questions: (a) What type of ICL estimator is learned within langu
age models? (b) What are the suitable performance metrics to evaluate ICL a
ccurately\, and what are their associated error rates? (c) How does the tra
nsformer architecture facilitate ICL? To address (a)\, we adopt a Bayesian p
erspective and demonstrate that ICL implicitly implements the Bayesian mod
el averaging algorithm. This Bayesian model averaging algorithm is shown t
o be approximated by the attention mechanism. For (b)\, we analyze ICL per
formance from an online learning standpoint and establish a sublinear regr
et bound. This shows that the error diminishes as the number of examples i
n the prompt increases. Regarding (c)\, beyond the encoded Bayesian model
averaging algorithm in the attention mechanism\, we reveal that during pre
training\, the total variation distance between the learned model and the
nominal model is bounded by the sum of an approximation error and a genera
lization error. Our findings aim to offer a unified understanding of the tr
ansformer and its ICL capability\, with bounds on ICL regret\, approximati
on\, and generalization. This deepens our comprehension of these crucial f
acets of modern language models and illuminates advanced prompt methodolog
ies for tackling more complex reasoning tasks.\n\n\n\nTong WangAssistant P
rofessor of Marketing\, School of Management\, Yale Universitytong.wang.tw
687@yale.edu | https://tongwang-ai.github.io/\n\n\n\n"Exploring Post Hoc I
nterpretation of Representations for Unstructured Data"In recent years\, d
eep learning has emerged as the prevailing solution for tackling decision-
making tasks involving unstructured data\, such as images and texts. The e
fficacy of any predictive undertaking related to unstructured data hinges
upon the caliber of their representation in the latent space—often refer
red to as embeddings. In essence\, the pivotal question revolves around wh
ether an insightful portrayal of unstructured data can be attained\, one t
hat encapsulates pertinent information for downstream tasks. Our objective
is to delve into the realm of post hoc interpretation concerning these re
presentations\, contextualizing our exploration within various domains\, i
ncluding business and medical data. Through an analytical lens\, we seek t
o unveil the concealed insights nestled within latent representations\, th
ereby discerning the origins of the informational cues present in the trai
 ning data. Part of this work is sponsored by the NSF and conducted in c
 lose collaboration with the Mayo Clinic.\n\n\n\nRefreshments will be se
 rved\n
CATEGORIES:Project Match,Training
LOCATION:Yale Institute for Foundations of Data Science\, Kline Tower 13th
Floor\, Room 1327\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ro
om 1327\, New Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITL
E=Yale Institute for Foundations of Data Science:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:335@fds.yale.edu
DTSTART;TZID=America/New_York:20230830T120000
DTEND;TZID=America/New_York:20230830T130000
DTSTAMP:20240226T190344Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-gaurav-mahajan
-some-open-problems-in-tcs/
SUMMARY:Yale Theory Student Seminar: Gaurav Mahajan\, "Some Open Problems i
n TCS"
DESCRIPTION:If you are interested in joining the mailing list\, please reac
h out to Marco Pirazzini (marco.pirazzini@yale.edu) or Siddharth Mitra (si
ddharth.mitra@yale.edu).\n\n\n\nYale Theory Student Seminar Website\n
CATEGORIES:Student Led Seminar,Seminar Series,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VEVENT
UID:387@fds.yale.edu
DTSTART;TZID=America/New_York:20230906T120000
DTEND;TZID=America/New_York:20230906T131500
DTSTAMP:20240226T190359Z
URL:https://fds.yale.edu/events/yale-theory-student-seminar-alkis-kalavasi
s-some-open-problems-in-tcs/
SUMMARY:Yale Theory Student Seminar: Alkis Kalavasis\, "Some Open Problems
in TCS"
DESCRIPTION:Abstract: \n\n\n\n"Overview of the things I am interested in (M
achine Learning & Optimization)"\n\n\n\nQuestion 1 (TCS): Query Complexity
of MaxCut and beyond.\n\n\n\nQuestion 2 (Computational Learning Theory):
 Introduction to quantum learning theory and open questions.\n\n\n\nWebsite
: https://yaletheorystudents.github.io/\n
CATEGORIES:Student Led Seminar,Seminar Series,Summer Seminar,Training
LOCATION:Yale Institute for Foundations of Data Science Common Area\, Kline
Tower 13th Floor\, New Haven\, CT\, 06511\, United States
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Kline Tower 13th Floor\, Ne
w Haven\, CT\, 06511\, United States;X-APPLE-RADIUS=100;X-TITLE=Yale Insti
tute for Foundations of Data Science Common Area:geo:0,0
END:VEVENT
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20221106T010000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20230312T030000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
END:VCALENDAR