PhD Defense
Deep learning has seen great success in training neural networks for complex prediction problems, such as large-scale image recognition, time-series forecasting, and learning single-agent behavioral models. However, neural networks have two notable weaknesses: 1) they are not sample-efficient, and 2) they are often not robust against (adversarial) input perturbations. Hence, it is challenging to apply deep learning to problems with exponential complexity, such as multi-agent games, complex long-term spatiotemporal dynamics, and noisy high-resolution data.
To address these issues, I will present methods that exploit structure to improve the sample efficiency, expressive power and robustness of neural networks in both supervised and reinforcement learning paradigms.
First, I will introduce hierarchical neural networks that model both short-term actions and long-term goals, and can learn human-level behavioral models from data for spatiotemporal multi-agent games, such as basketball. Second, I will show that behavioral policies with a hierarchical latent structure enable a form of structured exploration for faster reinforcement learning, by exploring only learned forms of multi-agent coordination.
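The hierarchical idea above can be illustrated with a minimal sketch: a macro-policy samples a long-term goal, and a micro-policy samples short-term actions conditioned on that goal. All names, sizes, and the tabular parameterization here are hypothetical simplifications; the thesis work uses learned neural networks, not lookup tables.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class HierarchicalPolicy:
    """Two-level policy sketch: a macro-policy picks a long-term goal,
    a micro-policy picks a short-term action conditioned on that goal.
    (Illustrative tabular parameters; a real model would use neural nets.)"""
    def __init__(self, n_states, n_goals, n_actions):
        self.W_macro = rng.normal(size=(n_states, n_goals))
        self.W_micro = rng.normal(size=(n_states, n_goals, n_actions))

    def act(self, state):
        # Sample a goal from the macro-policy, then an action from the
        # micro-policy conditioned on (state, goal).
        goal = rng.choice(self.W_macro.shape[1], p=softmax(self.W_macro[state]))
        action_probs = softmax(self.W_micro[state, goal])
        action = rng.choice(len(action_probs), p=action_probs)
        return goal, action

policy = HierarchicalPolicy(n_states=5, n_goals=3, n_actions=4)
goal, action = policy.act(state=2)
```

Sampling in the latent goal space rather than the raw action space is also what enables the structured-exploration view: exploration is restricted to learned, coordinated modes of behavior.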
Third, I will present tensor-train recurrent neural networks that can model high-order multiplicative structure in dynamical systems, and give state-of-the-art long-term forecasting performance.
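The tensor-train idea can be sketched independently of the recurrent architecture: a high-order weight tensor is factored into a chain of small cores, so each entry is a product of small matrices and the parameter count drops from exponential to linear in the order. The dimensions and ranks below are arbitrary illustrative choices, not the thesis configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tensor-train factorization of an order-3 tensor W[i, j, k]:
# W[i, j, k] = G1[i] @ G2[j] @ G3[k], with small "core" matrices.
dims, rank = (4, 4, 4), 2
G1 = rng.normal(size=(dims[0], 1, rank))     # boundary rank is 1
G2 = rng.normal(size=(dims[1], rank, rank))
G3 = rng.normal(size=(dims[2], rank, 1))     # boundary rank is 1

def tt_entry(i, j, k):
    # One entry of the full tensor, computed as a product of core slices.
    return (G1[i] @ G2[j] @ G3[k])[0, 0]

W = np.array([[[tt_entry(i, j, k) for k in range(dims[2])]
               for j in range(dims[1])] for i in range(dims[0])])

full_params = int(np.prod(dims))             # 64 entries in the dense tensor
tt_params = G1.size + G2.size + G3.size      # only 32 core entries
```

Here the dense tensor needs 64 parameters while the cores need 32; for higher orders and larger dimensions the gap grows rapidly, which is what makes high-order multiplicative interactions tractable inside an RNN cell.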
Finally, I will showcase stability training, a form of stochastic data augmentation to make neural networks more robust against weak adversarial perturbations, and highlight neural fingerprinting, a method to detect adversarial examples.
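The core of stability training can be sketched as a loss function: the usual task loss on a clean input, plus a term penalizing changes in the model's output under small stochastic input perturbations. The toy model, noise scale, and weighting below are illustrative assumptions, not the settings used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, w):
    # Toy differentiable model standing in for a neural network.
    return np.tanh(x @ w)

def stability_loss(x, y, w, sigma=0.1, alpha=0.5):
    """Task loss on the clean input plus a stability term that penalizes
    output changes under small random input perturbations."""
    clean_out = model(x, w)
    task = np.mean((clean_out - y) ** 2)
    x_noisy = x + sigma * rng.normal(size=x.shape)   # stochastic perturbation
    stability = np.mean((model(x_noisy, w) - clean_out) ** 2)
    return task + alpha * stability

x = rng.normal(size=(8, 3))
w = rng.normal(size=(3, 2))
y = np.zeros((8, 2))
loss = stability_loss(x, y, w)
```

Training against this combined objective encourages the network's outputs to vary smoothly around each training point, which is what confers robustness to weak perturbations.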