
Learning Robust Controllers Via Probabilistic Model-Based Policy Search

2021-10-26

Valentin Charvet, Bjørn Sand Jensen, Roderick Murray-Smith


Abstract

Model-based reinforcement learning approximates the true environment with a learned world model and uses that model to approximate the optimal policy. This family of algorithms is usually more sample-efficient than its model-free counterparts. We investigate whether controllers learned in this way are robust and generalize under small perturbations of the environment. Our work is inspired by PILCO, a probabilistic policy search algorithm. We show that enforcing a lower bound on the likelihood noise of the Gaussian process dynamics model regularizes the policy updates and yields more robust controllers. We demonstrate the empirical benefits of our method on a simulation benchmark.
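As a rough illustration of what a lower bound on the likelihood noise can look like, the sketch below runs a one-dimensional Gaussian process regression in NumPy with a floored noise variance. It is not the authors' implementation. The RBF kernel, the softplus parameterization, the floor value of 1e-2, and all function names are assumptions made for this example only.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); always strictly positive.
    return np.logaddexp(0.0, x)

def bounded_noise_variance(raw_noise, noise_floor):
    # Hypothetical parameterization: an unconstrained parameter is mapped
    # to a noise variance that can never fall below noise_floor.
    return noise_floor + softplus(raw_noise)

def rbf_kernel(xa, xb, lengthscale=1.0, signal_var=1.0):
    # Squared-exponential kernel on 1-D inputs (assumed for illustration).
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise_var):
    # Standard GP regression equations via a Cholesky factorization.
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    latent_var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    # Predictive variance of a noisy observation includes the noise term.
    return mean, latent_var + noise_var

rng = np.random.default_rng(0)
x_train = np.linspace(-2.0, 2.0, 15)
y_train = np.sin(x_train) + 0.05 * rng.standard_normal(15)
x_test = np.array([0.0, 3.0])

noise_floor = 1e-2  # assumed value, not taken from the paper
# Even when the raw parameter is driven very negative, the floor holds.
noise_var = bounded_noise_variance(raw_noise=-20.0, noise_floor=noise_floor)
mean, var = gp_posterior(x_train, y_train, x_test, noise_var)
```

With the floor in place, the predictive variance of the model cannot collapse below `noise_floor`, even in regions densely covered by training data. A policy optimized against such a model always sees some uncertainty in the predicted dynamics.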
