
Probabilistic Pontryagin's Maximum Principle for Continuous-Time Model-Based Reinforcement Learning

2025-04-03

David Leeftink, Çağatay Yıldız, Steffen Ridderbusch, Max Hinne, Marcel van Gerven


Abstract

Without exact knowledge of the true system dynamics, optimal control of non-linear continuous-time systems requires careful treatment of epistemic uncertainty. In this work, we propose a probabilistic extension to Pontryagin's maximum principle by minimizing the mean Hamiltonian with respect to epistemic uncertainty. We show minimization of the mean Hamiltonian is a necessary optimality condition when optimizing the mean cost, and propose a multiple shooting numerical method scalable to large-scale probabilistic dynamical models, including ensemble neural ordinary differential equations. Comparisons against state-of-the-art methods in online and offline model-based reinforcement learning tasks show that our probabilistic Hamiltonian formulation leads to reduced trial costs in offline settings and achieves competitive performance in online scenarios. By bridging optimal control and reinforcement learning, our approach offers a principled and practical framework for controlling uncertain systems with learned dynamics.
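The abstract's central idea, pointwise minimization of the mean Hamiltonian over an ensemble of learned dynamics models, can be illustrated with a minimal sketch. Everything below is assumed for illustration: the quadratic running cost, the toy linear ensemble standing in for ensemble neural ODEs, and the grid search standing in for the paper's multiple shooting method.

```python
import numpy as np

def running_cost(x, u):
    # Illustrative quadratic running cost l(x, u); not from the paper.
    return float(x @ x + 0.1 * u @ u)

# Toy "epistemic" ensemble: K perturbed linear models f_k(x, u) = A_k x + B u,
# standing in for an ensemble of neural ODEs.
rng = np.random.default_rng(0)
A_ensemble = [np.array([[0.0, 1.0], [-1.0, 0.0]])
              + 0.05 * rng.standard_normal((2, 2)) for _ in range(5)]
B = np.array([[0.0], [1.0]])

def mean_hamiltonian(x, u, lam):
    # H_k(x, u, lam) = l(x, u) + lam^T f_k(x, u), averaged over ensemble members.
    values = [running_cost(x, u) + float(lam @ (A @ x + B @ u)) for A in A_ensemble]
    return sum(values) / len(values)

# Pointwise minimization of the mean Hamiltonian in the control, here by a
# simple grid search (the paper instead uses a multiple shooting scheme).
x = np.array([1.0, 0.0])
lam = np.array([0.5, -1.0])
grid = np.linspace(-2.0, 2.0, 401)
u_star = min(grid, key=lambda u: mean_hamiltonian(x, np.array([u]), lam))
```

The mean Hamiltonian averages out epistemic uncertainty before the minimization, which is what makes it a tractable necessary condition for the mean cost.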
