Special Seminar in Applied Mathematics
February 2, 2016
Efficient Spectral Methods for Learning Mixture Models
Laboratory for Information & Decision Systems,
Massachusetts Institute of Technology
Mixture models are a class of probabilistic models that are powerful in a wide range of applications: Gaussian mixture models (GMMs) for unsupervised learning and clustering, hidden Markov models (HMMs) for natural language processing, and superimposed complex sinusoids in super-resolution microscopy. The learning problem, namely estimating the model parameters from observed samples, is known to be hard, as the latent variables make maximum likelihood estimation (MLE) a non-convex optimization problem. Theoretically, there exist worst-case hardness results for learning all three classes of mixture models above; in practice, heuristics such as the EM algorithm usually come with no performance guarantees.
In the first part of this talk, I will present new algorithms for learning these three classes of mixture models. The algorithms rely on spectral methods that exploit the structural properties of these models. Outside of worst-case instances, we prove that to achieve a target estimation accuracy, the algorithms require only a polynomial number of samples and run in polynomial time. The proofs leverage recent developments in random matrix theory.
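To give a flavor of the spectral approach, here is a minimal sketch (not the talk's algorithm; all parameters and the thresholding step are illustrative) of how projecting data onto a top singular subspace can recover the structure of a well-separated two-component GMM:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples from a mixture of k=2 spherical Gaussians in d dims.
# The means, dimensions, and sample size are illustrative choices.
d, n = 20, 2000
means = np.stack([np.full(d, 3.0), np.full(d, -3.0)])
labels = rng.integers(0, 2, size=n)
X = means[labels] + rng.standard_normal((n, d))

# Spectral step: project the data onto the top-k singular subspace of the
# centered data matrix; the component means lie near this subspace.
k = 2
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
proj = X @ Vt[:k].T  # n x k low-dimensional projection

# In the projected space a simple one-dimensional threshold separates the
# two components (a stand-in for a clustering step such as k-means).
pred = (proj[:, 0] > 0).astype(int)

# Align the predicted labels with the true ones and measure accuracy.
acc = max(np.mean(pred == labels), np.mean(pred != labels))
```

For well-separated components, the projection concentrates the signal in a few directions and the clustering becomes easy; the talk's guarantees quantify how much separation and how many samples suffice.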
In the second part of this talk, I will present our recent results on estimating low-rank probability matrices, a problem at the core of learning many mixture models, including HMMs, topic models, and stochastic block models (SBMs). Our algorithm achieves linear sample complexity, which we prove to be minimax optimal. This suggests that spectral algorithms can match the statistical efficiency of MLE while remaining computationally efficient.
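As a hedged illustration of the low-rank estimation problem (a naive truncated-SVD baseline, not the minimax-optimal algorithm from the talk; all sizes and the clip-and-renormalize step are assumptions), one can denoise an empirical pair-frequency matrix by projecting it to rank r:

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a random rank-r probability matrix P (entries are nonnegative
# and sum to 1); m, r, and n are illustrative choices.
m, r, n = 30, 2, 50000
B = rng.random((m, r))
P = B @ B.T
P /= P.sum()

# Draw n index pairs (i, j) with probability P[i, j] and form the
# empirical frequency matrix.
flat = rng.choice(m * m, size=n, p=P.ravel())
counts = np.bincount(flat, minlength=m * m).reshape(m, m)
P_emp = counts / n

# Truncate the SVD at rank r, then clip negative entries and
# renormalize back to a probability matrix.
U, s, Vt = np.linalg.svd(P_emp)
P_hat = (U[:, :r] * s[:r]) @ Vt[:r]
P_hat = np.clip(P_hat, 0, None)
P_hat /= P_hat.sum()

# Compare l1 estimation error of the raw and rank-truncated estimates.
err_emp = np.abs(P_emp - P).sum()
err_hat = np.abs(P_hat - P).sum()
```

The rank-r projection discards the noise lying outside the top singular subspace, which is why it typically improves on the raw empirical matrix; the talk's results characterize how few samples suffice for such low-rank estimates to be accurate.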
Based on joint work with Munther Dahleh, Rong Ge, Sham Kakade, and Greg Valiant.