Theory of Computing Seminar

Friday February 20, 2015 12:00 PM

Learning Arbitrary Statistical Mixtures of Discrete Distributions

Speaker: Yuval Rabani, The Hebrew University of Jerusalem
Location: Annenberg 213

Abstract:

We study the problem of learning from unlabeled samples very general
statistical mixture models on large finite sets. Specifically, we want to learn a model which is a
mixture of distributions $p$ on $\{1,2,\dots,n\}$. When we sample from the model, we do not observe a
specific $p$ from the mixture directly, but rather indirectly and in a very noisy fashion—we
observe $K$ independent samples of $\{1,2,\dots,n\}$ according to a distribution $p$. We give
efficient algorithms for learning this mixture model without making any restricting assumptions on the
structure of the mixture. We bound the quality of the solution as a function of the size of the
samples $K$ and the number of samples used. Our model and results have applications to a variety of
unsupervised learning scenarios, including learning topic models and collaborative filtering.

Most of the talk is based on a joint paper with Jian Li, Leonard J. Schulman, and Chaitanya Swamy.
We will also mention some earlier work with some of the above coauthors. 
Series Theory of Computing Seminar Series

Contact: Thomas Vidick vidick@cms.caltech.edu