Distributional Reinforcement Learning with Quantile Regression

2017-10-27

Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos

Abstract

In reinforcement learning an agent interacts with the environment by taking actions and observing the next state and reward. When sampled probabilistically, these state transitions, rewards, and actions can all induce randomness in the observed long-term return. Traditionally, reinforcement learning algorithms average over this randomness to estimate the value function. In this paper, we build on recent work advocating a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating the mean. That is, we examine methods of learning the value distribution instead of the value function. We give results that close a number of gaps between the theoretical and algorithmic results given by Bellemare, Dabney, and Munos (2017). First, we extend existing results to the approximate distribution setting. Second, we present a novel distributional reinforcement learning algorithm consistent with our theoretical formulation. Finally, we evaluate this new algorithm on the Atari 2600 games, observing that it significantly outperforms many of the recent improvements on DQN, including the related distributional algorithm C51.
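
To illustrate the difference between estimating only the mean return and modeling the return distribution, here is a minimal sketch of plain quantile regression on sampled returns. It is not the paper's algorithm: it uses no neural network, no Bellman update, and no Huber smoothing. The bimodal return samples, the midpoint quantile fractions, and the function names (`quantile_midpoints`, `pinball_loss`, `fit_quantiles`) are assumptions made for illustration only.

```python
import random


def quantile_midpoints(n):
    """Quantile fractions tau_i = (2i + 1) / (2n), i = 0..n-1 (assumed choice)."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def pinball_loss(theta, tau, sample):
    """Quantile regression (pinball) loss of the estimate theta for one sample."""
    u = sample - theta
    return u * (tau - (1.0 if u < 0 else 0.0))


def fit_quantiles(samples, n=5, lr=0.05, epochs=200, seed=0):
    """Fit n quantile estimates by stochastic subgradient descent on the pinball loss."""
    rng = random.Random(seed)
    taus = quantile_midpoints(n)
    thetas = [0.0] * n
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        for x in rng.sample(samples, len(samples)):
            for i, tau in enumerate(taus):
                # d(loss)/d(theta) = -(tau - 1[x < theta]); step against it.
                indicator = 1.0 if x < thetas[i] else 0.0
                thetas[i] += lr * (tau - indicator)
    return taus, thetas


# Hypothetical bimodal returns: about half near 0, about half near 10.
data_rng = random.Random(1)
returns = [
    data_rng.gauss(0.0, 0.5) if data_rng.random() < 0.5 else data_rng.gauss(10.0, 0.5)
    for _ in range(500)
]

taus, thetas = fit_quantiles(returns)
mean_return = sum(returns) / len(returns)

print("mean return:", round(mean_return, 2))
for tau, theta in zip(taus, thetas):
    print("tau =", tau, "quantile estimate =", round(theta, 2))
```

In this toy setting the mean lands between the two modes, a return the agent almost never observes, while the low and high quantile estimates settle near each mode. That is the kind of structure a mean-only value estimate discards.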

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Atari 2600 Alien | QR-DQN-1 | Score | 4,871 | | Unverified |
| Atari 2600 Amidar | QR-DQN-1 | Score | 1,641 | | Unverified |
| Atari 2600 Assault | QR-DQN-1 | Score | 22,012 | | Unverified |
| Atari 2600 Asterix | QR-DQN-1 | Score | 261,025 | | Unverified |
| Atari 2600 Asteroids | QR-DQN-1 | Score | 4,226 | | Unverified |
| Atari 2600 Atlantis | QR-DQN-1 | Score | 971,850 | | Unverified |
| Atari 2600 Bank Heist | QR-DQN-1 | Score | 1,249 | | Unverified |
| Atari 2600 Battle Zone | QR-DQN-1 | Score | 39,268 | | Unverified |
| Atari 2600 Beam Rider | QR-DQN-1 | Score | 34,821 | | Unverified |
| Atari 2600 Berzerk | QR-DQN-1 | Score | 3,117 | | Unverified |
| Atari 2600 Bowling | QR-DQN-1 | Score | 77.2 | | Unverified |
| Atari 2600 Boxing | QR-DQN-1 | Score | 99.9 | | Unverified |
| Atari 2600 Breakout | QR-DQN-1 | Score | 742 | | Unverified |
| Atari 2600 Centipede | QR-DQN-1 | Score | 12,447 | | Unverified |
| Atari 2600 Chopper Command | QR-DQN-1 | Score | 14,667 | | Unverified |
| Atari 2600 Crazy Climber | QR-DQN-1 | Score | 161,196 | | Unverified |
| Atari 2600 Defender | QR-DQN-1 | Score | 47,887 | | Unverified |
| Atari 2600 Demon Attack | QR-DQN-1 | Score | 121,551 | | Unverified |
| Atari 2600 Double Dunk | QR-DQN-1 | Score | 21.9 | | Unverified |
| Atari 2600 Enduro | QR-DQN-1 | Score | 2,355 | | Unverified |
| Atari 2600 Fishing Derby | QR-DQN-1 | Score | 39 | | Unverified |
| Atari 2600 Freeway | QR-DQN-1 | Score | 34 | | Unverified |
| Atari 2600 Frostbite | QR-DQN-1 | Score | 4,384 | | Unverified |
| Atari 2600 Gopher | QR-DQN-1 | Score | 113,585 | | Unverified |
| Atari 2600 Gravitar | QR-DQN-1 | Score | 995 | | Unverified |
| Atari 2600 HERO | QR-DQN-1 | Score | 21,395 | | Unverified |
| Atari 2600 Ice Hockey | QR-DQN-1 | Score | -1.7 | | Unverified |
| Atari 2600 James Bond | QR-DQN-1 | Score | 4,703 | | Unverified |
| Atari 2600 Kangaroo | QR-DQN-1 | Score | 15,356 | | Unverified |
| Atari 2600 Krull | QR-DQN-1 | Score | 11,447 | | Unverified |
| Atari 2600 Kung-Fu Master | QR-DQN-1 | Score | 76,642 | | Unverified |
| Atari 2600 Montezuma's Revenge | QR-DQN-1 | Score | 0 | | Unverified |
| Atari 2600 Ms. Pacman | QR-DQN-1 | Score | 5,821 | | Unverified |
| Atari 2600 Name This Game | QR-DQN-1 | Score | 21,890 | | Unverified |
| Atari 2600 Phoenix | QR-DQN-1 | Score | 16,585 | | Unverified |
| Atari 2600 Pitfall! | QR-DQN-1 | Score | 0 | | Unverified |
| Atari 2600 Pong | QR-DQN-1 | Score | 21 | | Unverified |
| Atari 2600 Private Eye | QR-DQN-1 | Score | 350 | | Unverified |
| Atari 2600 Q*Bert | QR-DQN-1 | Score | 572,510 | | Unverified |
| Atari 2600 River Raid | QR-DQN-1 | Score | 17,571 | | Unverified |
| Atari 2600 Road Runner | QR-DQN-1 | Score | 64,262 | | Unverified |
| Atari 2600 Robotank | QR-DQN-1 | Score | 59.4 | | Unverified |
| Atari 2600 Seaquest | QR-DQN-1 | Score | 8,268 | | Unverified |
| Atari 2600 Skiing | QR-DQN-1 | Score | -9,324 | | Unverified |
| Atari 2600 Solaris | QR-DQN-1 | Score | 6,740 | | Unverified |
| Atari 2600 Space Invaders | QR-DQN-1 | Score | 20,972 | | Unverified |
| Atari 2600 Star Gunner | QR-DQN-1 | Score | 77,495 | | Unverified |
| Atari 2600 Surround | QR-DQN-1 | Score | 8.2 | | Unverified |
| Atari 2600 Tennis | QR-DQN-1 | Score | 23.6 | | Unverified |
| Atari 2600 Time Pilot | QR-DQN-1 | Score | 10,345 | | Unverified |