
Rigorous Systems Research Group (RSRG) Seminar

Wednesday, August 3, 2016
12:00pm to 1:00pm
Annenberg 213
Variational Inference: From Artificial Temperatures to Stochastic Gradients
Stephan Mandt, Disney Research
Bayesian modeling is a popular approach to solving machine learning problems. In this talk, we will first review variational inference, where we map Bayesian inference to an optimization problem. This optimization problem is non-convex, meaning that there are many local optima that correspond to poor fits of the data. We first show that by introducing a "local temperature" for every data point and applying the machinery of variational inference, we can avoid some of these poor optima, suppress the effects of outliers, and ultimately find more meaningful patterns. In the second part of the talk, we will present a Bayesian view on one of the most important machine learning algorithms: Stochastic Gradient Descent (SGD). When operated with a constant, non-decreasing learning rate, SGD first marches towards the optimum of the objective and then samples from a stationary distribution that is centered around the optimum. As such, SGD resembles Markov Chain Monte Carlo (MCMC) algorithms which, after a burn-in period, draw samples from a Bayesian posterior. Drawing on the tools of variational inference, we investigate and formalize this connection. Our analysis reveals criteria that allow us to use SGD as an approximate scalable MCMC algorithm that can compete with more complicated state-of-the-art Bayesian approaches.
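To make the constant-learning-rate picture concrete, here is a minimal sketch, not taken from the talk, of running SGD with a fixed step size on the negative log posterior of a conjugate Gaussian mean model and treating the post-burn-in iterates as approximate posterior samples. All names and hyperparameters (eta, n_burn_in, batch_size, the toy model itself) are illustrative assumptions; the abstract's analysis concerns when and how such iterates can be used as a scalable approximate MCMC scheme.

```python
import numpy as np

# Toy setup: data drawn from N(2, 1), with a N(0, prior_var) prior on the mean.
rng = np.random.default_rng(0)
N, sigma = 1000, 1.0
data = rng.normal(loc=2.0, scale=sigma, size=N)
prior_var = 10.0

def grad_neg_log_posterior(theta, minibatch):
    # Stochastic gradient of the negative log posterior:
    # prior term plus the minibatch likelihood term rescaled to the full data set.
    grad_prior = theta / prior_var
    grad_lik = (N / len(minibatch)) * np.sum(theta - minibatch) / sigma**2
    return grad_prior + grad_lik

# Constant (non-decreasing) learning rate: iterates first move toward the optimum,
# then fluctuate in a stationary distribution centered around it.
eta, n_steps, n_burn_in, batch_size = 1e-4, 5000, 1000, 32
theta, samples = 0.0, []
for t in range(n_steps):
    minibatch = rng.choice(data, size=batch_size, replace=False)
    theta -= eta * grad_neg_log_posterior(theta, minibatch)
    if t >= n_burn_in:
        samples.append(theta)  # keep post-burn-in iterates as approximate samples

# For this conjugate model the exact Gaussian posterior is available for comparison.
post_var = 1.0 / (1.0 / prior_var + N / sigma**2)
post_mean = post_var * np.sum(data) / sigma**2
print("SGD iterate mean/std:  ", np.mean(samples), np.std(samples))
print("exact posterior mean/std:", post_mean, np.sqrt(post_var))
```

The spread of the post-burn-in iterates is governed by the learning rate and minibatch noise rather than by the posterior itself, which is why matching the two, as the abstract describes, requires the kind of analysis presented in the talk.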


For more information, please contact Sydney Garstang by email at [email protected].