
PhD Defense

Monday, April 23, 2018
1:00pm to 3:00pm
Annenberg 121
Exploiting Structure for Scalable and Robust Deep Learning
Stephan Zhang, Computing and Mathematical Sciences, Caltech

Deep learning has seen great success in training neural networks for complex prediction problems, such as large-scale image recognition, time-series forecasting, and learning single-agent behavioral models. However, neural networks have notable weaknesses: 1) they are not sample-efficient, and 2) they are often not robust to (adversarial) input perturbations. Hence, it is challenging to apply deep learning to problems with exponential complexity, such as multi-agent games, complex long-term spatiotemporal dynamics, or noisy high-resolution data.

To address these issues, I will present methods that exploit structure to improve the sample efficiency, expressive power and robustness of neural networks in both supervised and reinforcement learning paradigms.

First, I will introduce hierarchical neural networks that model both short-term actions and long-term goals, and can learn human-level behavioral models from data for spatiotemporal multi-agent games such as basketball. Second, I will show that behavioral policies with a hierarchical latent structure enable a form of structured exploration for faster reinforcement learning, by restricting exploration to learned forms of multi-agent coordination.
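
To make the hierarchical decomposition concrete, here is a minimal Python sketch of the two-level idea; the macro_policy and micro_policy functions are illustrative placeholders, not the models presented in the talk. A macro level samples a latent long-term goal, and a micro level samples the next action conditioned on the state and that goal.

    import numpy as np

    def hierarchical_policy(state, macro_policy, micro_policy, rng=None):
        # Two-level decomposition: the macro level samples a latent long-term
        # goal; the micro level samples the next short-term action conditioned
        # on the current state and that goal.
        rng = np.random.default_rng() if rng is None else rng
        goal_probs = macro_policy(state)            # distribution over latent goals
        goal = rng.choice(len(goal_probs), p=goal_probs)
        action_probs = micro_policy(state, goal)    # goal-conditioned action distribution
        action = rng.choice(len(action_probs), p=action_probs)
        return goal, action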

Third, I will present tensor-train recurrent neural networks that can model high-order multiplicative structure in dynamical systems and achieve state-of-the-art long-term forecasting performance.
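
As a rough illustration of the underlying idea (not the thesis implementation), a tensor-train factorization stores a high-order weight tensor as a chain of small cores and contracts it with the inputs core by core, so the exponentially large full tensor is never materialized:

    import numpy as np

    def tt_contract(cores, vectors):
        # cores[k] has shape (r_{k-1}, n_k, r_k) with r_0 = r_d = 1;
        # vectors[k] has shape (n_k,). Contracting core by core evaluates
        # the sum over i_1..i_d of W[i_1,...,i_d] * x_1[i_1] * ... * x_d[i_d]
        # without ever forming the full n_1 x ... x n_d weight tensor W.
        v = np.ones(1)
        for core, x in zip(cores, vectors):
            v = v @ np.einsum('anb,n->ab', core, x)
        return v.item()

    # Example: a 3rd-order multiplicative interaction, mode size 4, TT-rank 2.
    d, n, r = 3, 4, 2
    ranks = [1] + [r] * (d - 1) + [1]
    cores = [np.random.randn(ranks[k], n, ranks[k + 1]) for k in range(d)]
    xs = [np.random.randn(n) for _ in range(d)]
    print(tt_contract(cores, xs))

The number of parameters grows linearly in the tensor order d rather than exponentially, which is what makes modeling high-order interactions tractable.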

Finally, I will showcase stability training, a form of stochastic data augmentation to make neural networks more robust against weak adversarial perturbations, and highlight neural fingerprinting, a method to detect adversarial examples.
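
As a sketch of the stability-training idea, assuming a generic model and task loss (the names, the Gaussian perturbation, and the squared-error stability term below are illustrative choices, not necessarily those used in the talk): the objective adds a penalty on how much the network's output changes under a small random perturbation of the input.

    import numpy as np

    def stability_objective(model, x, y, task_loss, alpha=0.01, sigma=0.05, rng=None):
        # Evaluate the model on a clean input and on a noise-perturbed copy,
        # then penalize the difference between the two outputs so the network
        # is trained to be locally stable around each training point.
        rng = np.random.default_rng() if rng is None else rng
        x_noisy = x + sigma * rng.standard_normal(x.shape)
        out_clean, out_noisy = model(x), model(x_noisy)
        stability = np.mean((out_clean - out_noisy) ** 2)
        return task_loss(out_clean, y) + alpha * stability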