FDS Statistics & Data Science Seminar
Efficiently learning and sampling from multimodal distributions using data-based initialization
Speaker: Thuy-Duong "June" Vuong, Postdoctoral Fellow, Miller Institute, Berkeley

Monday, March 24, 2025, 3:30PM - 5:00PM
3:30pm - Pre-talk meet and greet teatime on the 11th floor
4:00pm - Talk in 1307

Location: Yale Institute for Foundations of Data Science, Kline Tower 13th Floor, Room 1327, New Haven, CT 06511

Webcast: https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=ea4e0f03-5e28-48a7-b343-b233012bce08
Abstract: Learning to sample is a central task in generative AI: the goal is to generate (infinitely many more) samples from a target distribution $\mu$ given a small number of samples from $\mu$. It is well known that traditional algorithms such as Glauber or Langevin dynamics are highly inefficient when the target distribution is multimodal, as they take exponential time to converge from a \emph{worst-case start}, while recently proposed algorithms such as denoising diffusion (DDPM) require information that is computationally hard to learn. In this talk, we propose a novel and conceptually simple algorithmic framework for learning multimodal target distributions by initializing traditional sampling algorithms at the empirical distribution. As applications, we show new results for two representative distribution families: Gaussian mixtures and Ising models. When the target distribution $\mu$ is a mixture of $k$ well-conditioned Gaussians, we show that the (continuous) Langevin dynamics initialized from the empirical distribution over $\tilde{O}(k/\epsilon^2)$ samples converge, with high probability over the samples, to $\mu$ in $\tilde{O}(1)$ time; both the number of samples and the convergence time are optimal. When $\mu$ is a low-complexity Ising model, we show a similar result for the Glauber dynamics with approximate marginals learned via pseudolikelihood estimation, demonstrating for the first time that such low-complexity Ising models can be efficiently learned from samples.
Based on joint work with Frederic Koehler and Holden Lee.
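The core idea above, running a classical sampler started at the empirical distribution rather than a worst-case point, can be illustrated with a minimal sketch. The example below is not the authors' implementation; it assumes a toy one-dimensional target (an equal-weight mixture of two unit-variance Gaussians with hypothetical means -4 and 4, so the score function is available in closed form) and runs discretized Langevin dynamics with one chain per data sample.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    # Score function grad log mu(x) for the toy target
    # mu = 0.5 * N(-4, 1) + 0.5 * N(4, 1).
    # w is the posterior probability that x came from the +4 mode.
    w = 1.0 / (1.0 + np.exp(-8.0 * x))
    return -(x + 4.0) + 8.0 * w

# Data-based initialization: a small set of samples from the target
# (here drawn synthetically) seeds one Langevin chain each, so both
# modes are covered from the start.
samples = np.concatenate([rng.normal(-4, 1, 50), rng.normal(4, 1, 50)])

def langevin(x0, step=0.01, n_steps=2000):
    """Discretized (unadjusted) Langevin dynamics from initial points x0."""
    x = x0.copy()
    for _ in range(n_steps):
        x = x + step * score(x) + np.sqrt(2.0 * step) * rng.normal(size=x.shape)
    return x

out = langevin(samples)
```

A worst-case start (e.g., all chains at -4) would take exponentially long to discover the +4 mode, since the dynamics must cross a large log-density barrier; initializing at the empirical samples sidesteps this, which is the phenomenon the talk makes rigorous.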
Speaker bio: I am a postdoctoral fellow at the Miller Institute, Berkeley, hosted by Alistair Sinclair.
I have a broad interest in theoretical computer science. My current research focuses on algorithms for sampling from complex high-dimensional distributions, with applications to statistical physics, generative AI, and other fields.
I received my PhD in Computer Science from Stanford University in 2024, advised by Nima Anari and Moses Charikar. My PhD was partially supported by a Microsoft Research PhD Fellowship (2022-2024).
I received my Bachelor’s degrees in Mathematics and Computer Science at the Massachusetts Institute of Technology (MIT) in 2019.
I am joining UC San Diego CSE as an assistant professor in January 2026.
https://thuyduongvuong.github.io