
CMX Lunch Seminar

Wednesday, February 28, 2024
12:00pm to 1:00pm
Annenberg 213
A PDE-Based Bellman Equation for Continuous-time Reinforcement Learning
Yuhua Zhu, Assistant Professor of Mathematics, Department of Mathematics and Halicioğlu Data Science Institute, University of California San Diego

In this talk, we address continuous-time reinforcement learning in the setting where the dynamics follow a stochastic differential equation. When the underlying dynamics are unknown and only discrete-time observations are available, how can we effectively conduct policy evaluation? We begin by highlighting that the commonly used Bellman equation is not always a reliable approximation of the true value function. We then introduce PhiBE, a PDE-based Bellman equation that approximates the true value function more accurately, especially when the underlying dynamics change slowly, and we extend PhiBE to higher orders, yielding increasingly accurate approximations. Finally, we present a numerical algorithm based on the Galerkin method, tailored to solving PhiBE when only discrete-time trajectory data are available. Numerical experiments validate the proposed theoretical guarantees.
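For context (standard stochastic-control background, not specific to the talk): when the state follows an SDE and rewards are discounted at rate beta, the true value function solves a linear second-order PDE, and both the conventional Bellman equation and PhiBE can be viewed as discrete-data approximations of this object. In a common formulation,

```latex
ds_t = \mu(s_t)\,dt + \sigma(s_t)\,dB_t,
\qquad
V(s) = \mathbb{E}\!\left[\int_0^\infty e^{-\beta t}\, r(s_t)\,dt \;\middle|\; s_0 = s\right],
```

and $V$ satisfies

```latex
\beta V(s) = r(s) + \mu(s)\cdot\nabla V(s)
  + \tfrac{1}{2}\,\mathrm{tr}\!\big(\sigma(s)\sigma(s)^{\top}\,\nabla^{2} V(s)\big).
```

To illustrate the Galerkin idea in this setting (a minimal sketch, not the speaker's algorithm; the dynamics, reward, basis, and estimator choices below are all illustrative assumptions), one can fit a polynomial value function from a single discrete-time trajectory by estimating drift and diffusion from increments and projecting the PDE residual onto the basis, with the observed states serving as quadrature nodes:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Illustrative setup (hypothetical, not from the talk) ---
# 1D Ornstein-Uhlenbeck dynamics ds = -s dt + 0.5 dB, observed every dt
beta, dt, n = 1.0, 0.01, 100_000
s = np.empty(n)
s[0] = 0.0
for k in range(n - 1):
    s[k + 1] = s[k] - s[k] * dt + 0.5 * np.sqrt(dt) * rng.standard_normal()

def reward(x):
    return -x**2  # illustrative running reward r(s)

# Finite-difference estimates of drift and squared diffusion from increments
inc = s[1:] - s[:-1]
x = s[:-1]
mu_hat = inc / dt        # estimates mu(s_k) up to O(dt) bias
sig2_hat = inc**2 / dt   # estimates sigma^2(s_k) up to O(dt) bias

# Polynomial basis phi_j(s) = s^j with first and second derivatives
deg = 4
J = np.arange(deg + 1)
phi = x[:, None] ** J
dphi = J * x[:, None] ** np.maximum(J - 1, 0)
d2phi = J * (J - 1) * x[:, None] ** np.maximum(J - 2, 0)

# Galerkin projection of the residual of
#   beta V - r - mu V' - (1/2) sigma^2 V'' = 0
# onto the basis, averaging over the trajectory as quadrature nodes
L = beta * phi - mu_hat[:, None] * dphi - 0.5 * sig2_hat[:, None] * d2phi
A = phi.T @ L / len(x)
b = phi.T @ reward(x) / len(x)
theta = np.linalg.solve(A, b)

def V(z):
    return np.asarray(z)[..., None] ** J @ theta

print("fitted coefficients:", theta)
print("V(0) estimate:", V(0.0))
```

Averaging over the trajectory weights the projection by the empirical state distribution, the same device used in least-squares temporal-difference methods. Per the abstract, the talk's algorithm applies Galerkin machinery of this kind to PhiBE, a more accurate surrogate for the PDE above, rather than to the naive finite-difference residual sketched here.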

For more information, please contact Jolene Brink by phone at (626) 395-2813, by email at [email protected], or visit the CMX website.