Statistics & Data Science Seminar

Gradient Descent Dominates Ridge: A Statistical View on Implicit Regularization

Speaker: Jingfeng Wu (Berkeley)

Postdoctoral Fellow at the Simons Institute

UC Berkeley

Monday, September 22, 2025

3:30PM - 5:00PM

Teatime at 3:30pm in 1307
Talk at 4:00-5:00pm in 1327

Location: Yale Institute for Foundations of Data Science, Kline Tower 13th Floor, Room 1327, New Haven, CT 06511 and via Webcast: https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=dca8dff4-f559-44f7-8928-b34600d4f9b6

Talk summary: A key puzzle in deep learning is how simple gradient methods find generalizable solutions without explicit regularization. This talk discusses the implicit regularization of gradient descent (GD) through the lens of statistical dominance. Using least squares as a clean proxy, we present two surprising findings.

First, GD dominates ridge regression. For any well-specified Gaussian least squares problem, the finite-sample excess risk of optimally stopped GD is no more than a constant times that of optimally tuned ridge regression. However, there is a natural subset of these problems where GD achieves a polynomially smaller excess risk. Thus, implicit regularization is statistically superior to explicit regularization, in addition to its computational advantages.

Second, GD and online stochastic gradient descent (SGD) are incomparable. We construct a sequence of well-specified Gaussian least squares problems where optimally stopped GD is polynomially worse than online SGD, and vice versa. Our construction leverages a key insight from benign overfitting, revealing a fundamental separation between batch and online learning.

This is joint work with Peter Bartlett, Sham Kakade, Jason Lee, and Bin Yu.

Speaker bio: Jingfeng Wu is a Postdoctoral Fellow at the Simons Institute for the Theory of Computing at UC Berkeley, where he is hosted by Peter Bartlett and Bin Yu. He is a member of the NSF/Simons Collaboration on the Theoretical Foundations of Deep Learning. Wu received his Ph.D. in Computer Science from Johns Hopkins University, where he was advised by Vladimir Braverman.

