H.B. Keller Colloquium

Monday, April 26, 2021, 4:00 PM

Reinforcement Learning for Complex Environments: Tree Search, Function Approximators and Markov Games

Speaker: Qiaomin Xie, School of Operations Research and Information Engineering, Cornell University
Location: Online Event

Recent literature has witnessed much progress on the algorithmic and theoretical foundations of Reinforcement Learning (RL), particularly for single-agent problems with small state/action spaces. Our understanding of, and algorithmic toolbox for, RL in complex environments, however, remain relatively limited. In this talk, I will discuss some of my work on scalable and provably efficient RL for challenging settings with large state/action spaces and multiple strategic agents.

First, I will focus on simulation-based methods, as exemplified by Monte-Carlo Tree Search (MCTS). MCTS is a powerful paradigm for online planning that enjoys remarkable empirical success but has lacked theoretical understanding. We provide a complete and rigorous non-asymptotic analysis of MCTS. Our analysis develops a general framework based on a hierarchy of bandits, and highlights the importance of using a non-standard confidence bound (also used by AlphaGo) for convergence. I will further discuss combining MCTS with supervised learning and its generalization to continuous action spaces.
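
To make the "non-standard confidence bound" concrete, below is a minimal, hypothetical Python sketch of MCTS-style action selection in which the usual logarithmic UCB bonus is replaced by a polynomial exploration bonus. The class, the constant c, and the exponent alpha are illustrative placeholders, not the quantities or values from the speaker's analysis.

import math

class Node:
    def __init__(self, actions):
        self.actions = list(actions)
        self.counts = {a: 0 for a in self.actions}     # visit counts per action
        self.values = {a: 0.0 for a in self.actions}   # running mean returns

    def select(self, total_visits, c=1.0, alpha=0.5):
        # Polynomial exploration bonus: c * n^alpha / count,
        # in contrast with the classical sqrt(log n / count) UCB bonus.
        def score(a):
            if self.counts[a] == 0:
                return math.inf
            bonus = c * (total_visits ** alpha) / self.counts[a]
            return self.values[a] + bonus
        return max(self.actions, key=score)

    def update(self, action, reward):
        # Incremental mean update after a simulated rollout through this node.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]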

In the second part of the talk, I will discuss on-policy RL for zero-sum Markov games, which generalize Markov decision processes to multi-agent settings. We consider function approximation to deal with continuous and unbounded state spaces. Building on a fruitful marriage with algorithmic game theory, we develop the first computationally efficient algorithm for this setting, with a provable regret bound that is independent of the cardinality and ambient dimension of the state space.
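
As background for the connection to algorithmic game theory, the sketch below (not the speaker's algorithm) shows the standard stage-game subroutine that many methods for zero-sum Markov games rely on: solving a zero-sum matrix game by linear programming. The payoff matrix A is a placeholder; in an RL algorithm its entries would typically come from estimated Q-values over joint actions.

import numpy as np
from scipy.optimize import linprog

def solve_zero_sum_game(A):
    """Return (game value, row player's maximin mixed strategy) for payoff matrix A."""
    m, n = A.shape
    # Variables: x_1..x_m (mixed strategy) and v (game value); maximize v.
    c = np.zeros(m + 1)
    c[-1] = -1.0                                    # linprog minimizes, so minimize -v
    # For every opponent column j: v - sum_i x_i * A[i, j] <= 0
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # The strategy's probabilities sum to one.
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:-1]

# Example: matching pennies has value 0 and a uniform optimal strategy.
value, strategy = solve_zero_sum_game(np.array([[1.0, -1.0], [-1.0, 1.0]]))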

Series: H. B. Keller Colloquium Series

Contact: Diana Bohler at 626-232-6138 or dbohler@caltech.edu