
ADDQ: Adaptive Distributional Double Q-Learning

2025-06-24 · Code Available

Leif Döring, Benedikt Wille, Maximilian Birr, Mihail Bîrsan, Martin Slowik


Abstract

Bias in the estimation of Q-values is a well-known obstacle that slows the convergence of Q-learning and actor-critic methods. The success of modern RL algorithms is due in part to direct or indirect overestimation-reduction mechanisms. We propose an easy-to-implement method, built on top of distributional reinforcement learning (DRL) algorithms, that deals with overestimation in a locally adaptive way. Our framework is simple to implement: existing distributional algorithms can be improved with a few lines of code. We provide theoretical evidence and use double Q-learning to show how to include locally adaptive overestimation control in existing algorithms. Experiments are provided for tabular, Atari, and MuJoCo environments.
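The abstract describes locally adaptive overestimation control layered on double Q-learning, where distributional information decides how strongly to correct for overestimation at each state-action pair. The paper's exact rule is not given here, so the following is only an illustrative tabular sketch: it keeps two Q-tables as in double Q-learning, uses a running TD-error variance as a stand-in for the return-distribution spread a full DRL agent would track, and blends the single-estimator and double-estimator targets with a weight `beta` derived from that local variance. The blending schedule and the `kappa` parameter are assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 3
gamma, alpha = 0.9, 0.1

# Two independent Q-tables, as in double Q-learning.
Q1 = np.zeros((n_states, n_actions))
Q2 = np.zeros((n_states, n_actions))
# Running per-(s, a) uncertainty proxy (hypothetical stand-in for the
# distributional spread a DRL agent would estimate).
var = np.ones((n_states, n_actions))

def adaptive_target(Qa, Qb, s_next, r, kappa=1.0):
    """Blend single-estimator and double-estimator bootstrap targets.

    beta -> 1 (trust the double, underestimation-prone target) where the
    local uncertainty is high; beta -> 0 (trust the single target) where
    it is low. This schedule is an illustrative choice.
    """
    a_star = int(np.argmax(Qa[s_next]))
    single = Qa[s_next, a_star]   # standard Q-learning target
    double = Qb[s_next, a_star]   # double Q-learning target
    beta = var[s_next, a_star] / (var[s_next, a_star] + kappa)
    return r + gamma * ((1 - beta) * single + beta * double)

# One simulated transition to show the update rule.
s, a, r, s_next = 0, 1, 1.0, 2
if rng.random() < 0.5:          # randomly pick which table to update
    y = adaptive_target(Q1, Q2, s_next, r)
    td = y - Q1[s, a]
    Q1[s, a] += alpha * td
else:
    y = adaptive_target(Q2, Q1, s_next, r)
    td = y - Q2[s, a]
    Q2[s, a] += alpha * td
# Track the uncertainty proxy via the squared TD error (assumption).
var[s, a] += alpha * (td ** 2 - var[s, a])
```

In a deep-RL setting the same idea would modify only the target computation of an existing distributional critic, which is consistent with the abstract's claim that a few lines of code suffice.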
