FDS Statistics & Data Science Seminar

Efficiently learning and sampling from multimodal distributions using data-based initialization

Speaker: Thuy-Duong "June" Vuong (Miller Institute)

Postdoctoral Fellow

Miller Institute, Berkeley

Monday, March 24, 2025

3:30PM - 5:00PM

3:30pm - Pre-talk meet and greet teatime on the 11th floor
4:00pm - talk in 1307

Location: Yale Institute for Foundations of Data Science, Kline Tower 13th Floor, Room 1327, New Haven, CT 06511 and via Webcast: https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=ea4e0f03-5e28-48a7-b343-b233012bce08

Abstract: Learning to sample is a central task in generative AI: the goal is to generate (infinitely many more) samples from a target distribution $\mu$ given a small number of samples from $\mu$. It is well-known that traditional algorithms such as Glauber or Langevin dynamics are highly inefficient when the target distribution is multimodal, as they take exponential time to converge from a \emph{worst-case start}, while recently proposed algorithms such as denoising diffusion (DDPM) require information that is computationally hard to learn. In this talk, we propose a novel and conceptually simple algorithmic framework to learn multimodal target distributions by initializing traditional sampling algorithms at the empirical distribution. As applications, we show new results for two representative distribution families: Gaussian mixtures and Ising models. When the target distribution $\mu$ is a mixture of $k$ well-conditioned Gaussians, we show that the (continuous) Langevin dynamics initialized from the empirical distribution over $\tilde{O}(k/\epsilon^2)$ samples converges, with high probability over the samples, to $\mu$ in $\tilde{O}(1)$ time; both the number of samples and the convergence time are optimal. When $\mu$ is a low-complexity Ising model, we show a similar result for the Glauber dynamics with approximate marginals learned via pseudolikelihood estimation, demonstrating for the first time that such low-complexity Ising models can be efficiently learned from samples.
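The core idea in the abstract can be illustrated with a small sketch (not the authors' code): discretized Langevin dynamics on a 1D two-mode Gaussian mixture, with each chain initialized at a fresh empirical sample rather than at a worst-case point. The mixture $0.5\,N(-4,1)+0.5\,N(4,1)$, the step size, and the step count are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
modes = np.array([-4.0, 4.0])  # equal-weight, unit-variance mixture components

def score(x):
    # grad log density of the equal-weight mixture of N(-4, 1) and N(4, 1)
    w = np.exp(-0.5 * (x[:, None] - modes) ** 2)  # unnormalized responsibilities
    w /= w.sum(axis=1, keepdims=True)
    return (w * (modes - x[:, None])).sum(axis=1)

# "data-based initialization": start every chain at an empirical sample from mu
samples = rng.choice(modes, size=1000) + rng.normal(size=1000)

x, h = samples.copy(), 0.01
for _ in range(500):  # Euler-Maruyama discretization of Langevin dynamics
    x = x + h * score(x) + np.sqrt(2 * h) * rng.normal(size=x.shape)
```

A worst-case start (e.g. all chains in one mode) would need exponentially long to cross between the well-separated modes; starting from data, each mode is already populated roughly in proportion to its weight, so the dynamics only needs to mix locally.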

Based on joint work with Frederic Koehler and Holden Lee.

Speaker bio: I am a postdoctoral fellow at the Miller Institute, Berkeley, hosted by Alistair Sinclair.
I have a broad interest in theoretical computer science. My current research interest is in algorithms for sampling from complex high-dimensional distributions, with applications to statistical physics, generative AI, and other fields.
I received my PhD in Computer Science from Stanford University in 2024, advised by Nima Anari and Moses Charikar. My PhD was partially supported by a Microsoft Research PhD Fellowship (2022-2024).
I received my Bachelor’s degrees in Mathematics and Computer Science from the Massachusetts Institute of Technology (MIT) in 2019.
I am joining UC San Diego CSE as an assistant professor in January 2026.

https://thuyduongvuong.github.io
