Robust Q-Learning for finite ambiguity sets

2024-07-05 · Code Available

Cécile Decker, Julian Sester


Abstract

In this paper we propose a novel Q-learning algorithm for solving distributionally robust Markov decision problems in which the ambiguity set of probability measures can be chosen arbitrarily, as long as it comprises only finitely many measures. Our approach therefore goes beyond the well-studied cases in which the ambiguity set is a ball around some reference measure, with the distance to the reference measure measured in terms of the Wasserstein distance or the Kullback-Leibler divergence. It allows the practitioner to create ambiguity sets better tailored to her needs and to solve the associated robust Markov decision problem via a Q-learning algorithm whose convergence is guaranteed by our main result. Moreover, we demonstrate the tractability of our approach in several numerical experiments.
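To make the setting concrete, the following is a minimal sketch of the robust Bellman fixed point that such a problem targets when the ambiguity set is a finite list of transition kernels: the worst case is simply a minimum over the list. This is a model-based fixed-point iteration written for illustration, not the sample-based Q-learning algorithm of the paper. The function name `robust_q_iteration`, the array layout, and the toy two-state example are all assumptions made here.

```python
import numpy as np


def robust_q_iteration(kernels, rewards, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iterate the robust Bellman operator over a finite ambiguity set.

    kernels: array of shape (K, S, A, S); kernels[k, s, a, t] is the
             probability of moving from state s to state t under action a
             according to the k-th candidate measure (hypothetical layout).
    rewards: array of shape (S, A, S); reward for the transition (s, a, t).
    """
    kernels = np.asarray(kernels, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    n_states, n_actions = kernels.shape[1], kernels.shape[2]
    q = np.zeros((n_states, n_actions))
    for _ in range(max_iter):
        v = q.max(axis=1)  # greedy value of each next state, shape (S,)
        # Expected one-step target under every candidate kernel: shape (K, S, A).
        targets = np.einsum(
            "ksat,sat->ksa", kernels, rewards + gamma * v[None, None, :]
        )
        # Worst case over the finite ambiguity set.
        q_new = targets.min(axis=0)
        if np.max(np.abs(q_new - q)) < tol:
            return q_new
        q = q_new
    return q


# Toy example (assumed): two states, one action. State 1 is absorbing with
# zero reward; staying in state 0 pays 1. The two candidate measures
# disagree on the probability of staying in state 0 (0.9 versus 0.5).
kernel_a = [[[0.9, 0.1]], [[0.0, 1.0]]]
kernel_b = [[[0.5, 0.5]], [[0.0, 1.0]]]
toy_kernels = np.array([kernel_a, kernel_b])  # shape (2, 2, 1, 2)
toy_rewards = np.zeros((2, 1, 2))
toy_rewards[0, 0, 0] = 1.0

q_robust = robust_q_iteration(toy_kernels, toy_rewards, gamma=0.9)
print(q_robust)
```

In this toy example the minimum is attained by the second kernel, so the robust value of state 0 solves Q = 0.5 (1 + 0.9 Q). Because the ambiguity set is just a list, nothing requires the candidate kernels to lie in a ball around a reference measure.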
