Events
Statistics & Data Science Seminar
"Structured Topic Modeling: Leveraging Sparsity and Graphs For Improved Inference"
Speaker: Claire Donnat (Chicago), Assistant Professor of Statistics at the University of Chicago
Monday, November 3, 2025 3:30PM - 5:00PM
Teatime at 3:30pm in 1307
Talk at 4:00pm in 1327
Location: Yale Institute for Foundations of Data Science, Kline Tower 13th Floor, Room 1327, New Haven, CT 06511 and via Webcast: https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=153489ad-83ae-490e-b86d-b361010cd81b
Abstract: Classical topic models (LDA, pLSI) treat documents as independent, which wastes information when texts are short or vocabularies are large. I will present two structured alternatives with statistical guarantees. First, a weakly sparse extension of pLSI that stabilizes estimation in high-vocabulary settings by shrinking rare terms without enforcing hard zeros. Second, a graph-aligned singular value decomposition that incorporates known relationships between documents (e.g., spatial proximity or sample similarity) to improve recovery of document–topic and topic–word matrices. For both methods we derive non-asymptotic, high-probability error bounds for topic proportions and word distributions. Applications to spatial proteomics, microbiome profiles, and scientific abstracts show accuracy and interpretability gains when side information is available. The talk highlights when structure helps, how to encode it, and what guarantees are achievable.
Keywords: Topic Modeling · Constrained SVD
