Optimization Aspects of Temporal Abstraction in Reinforcement Learning
Temporal abstraction refers to the idea that complicated sequential decision-making problems can sometimes be simplified by considering the "big picture" first. In this talk, I will give an overview of some of my work on learning such temporal abstractions end-to-end within the "option-critic" architecture (Bacon et al., 2017). I will then explain how other related hierarchical RL frameworks, such as Feudal RL by Dayan and Hinton (1993), can also be approached under the same option-critic architecture. However, we will see that this formulation leads to a so-called "bilevel" optimization problem. While this is a more difficult problem, the good news is that the literature on bilevel optimization is rich and many of its tools have yet to be re-discovered by our community. Finally, I will show how "iterative differentiation" techniques (Griewank and Walther, 2008) can be applied to our problem while providing a new interpretation of the "inverse RL" approach of Rust (1988).
Contact: Pamela Albertson email@example.com