Count-Based Exploration with Neural Density Models

2017-03-03 · ICML 2017 · Code Available

Georg Ostrovski, Marc G. Bellemare, Aaron van den Oord, Remi Munos


Code

github.com/nolisten/erl
tf★ 0

Abstract

Bellemare et al. (2016) introduced the notion of a pseudo-count, derived from a density model, to generalize count-based exploration to non-tabular reinforcement learning. This pseudo-count was used to generate an exploration bonus for a DQN agent and, combined with a mixed Monte Carlo update, was sufficient to achieve state-of-the-art results on the Atari 2600 game Montezuma's Revenge. We consider two questions left open by their work: First, how important is the quality of the density model for exploration? Second, what role does the Monte Carlo update play in exploration? We answer the first question by demonstrating the use of PixelCNN, an advanced neural density model for images, to supply a pseudo-count. In particular, we examine the intrinsic difficulties in adapting Bellemare et al.'s approach when assumptions about the model are violated. The result is a more practical and general algorithm requiring no special apparatus. We combine PixelCNN pseudo-counts with different agent architectures to dramatically improve the state of the art on several hard Atari games. One surprising finding is that the mixed Monte Carlo update is a powerful facilitator of exploration in the sparsest of settings, including Montezuma's Revenge.
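
The abstract names three ingredients without giving formulas: a pseudo-count derived from a density model, an exploration bonus built from it, and a mixed Monte Carlo update. The sketch below is only an illustration of how these pieces might fit together. It assumes the pseudo-count definition from Bellemare et al. (2016), where `rho` is the model's probability of an observation before training on it and `rho_prime` is the probability after one update. The inverse-square-root bonus, the `scale` and `eps` values, the mixing weight `beta`, and all function names are hypothetical choices made for this example, not values or APIs taken from this page or from the linked code.

```python
import math


def pseudo_count(rho: float, rho_prime: float) -> float:
    """Pseudo-count implied by a density model for one observation.

    rho:       model probability of the observation before the update.
    rho_prime: model probability of the same observation after one update.
    Assumes the model is "learning-positive", i.e. rho_prime > rho.
    """
    if not (0.0 < rho < rho_prime < 1.0):
        raise ValueError("need 0 < rho < rho_prime < 1")
    return rho * (1.0 - rho_prime) / (rho_prime - rho)


def exploration_bonus(count: float, scale: float = 1.0, eps: float = 0.01) -> float:
    """Illustrative bonus that shrinks as the pseudo-count grows (assumed form)."""
    return scale / math.sqrt(count + eps)


def mixed_mc_target(one_step_target: float, mc_return: float, beta: float) -> float:
    """Blend a one-step bootstrapped target with a Monte Carlo return."""
    return (1.0 - beta) * one_step_target + beta * mc_return


# Sanity check with an empirical-frequency "density model": an observation
# seen 3 times out of 10 has rho = 3/10, and rho_prime = 4/11 after one more
# visit. The pseudo-count should then recover the true count of 3.
n_hat = pseudo_count(3 / 10, 4 / 11)
bonus = exploration_bonus(n_hat)
target = mixed_mc_target(one_step_target=1.0, mc_return=5.0, beta=0.25)
```

With an exact empirical-frequency model the pseudo-count equals the real visit count, which is the property that lets a learned density model such as PixelCNN stand in for a count table when states are never revisited exactly.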

Tasks

Atari Games Montezuma's Revenge Reinforcement Learning

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Atari 2600 Freeway	DQN-CTS	Score	33	—	Unverified
Atari 2600 Freeway	DQN-PixelCNN	Score	31.7	—	Unverified
Atari 2600 Gravitar	DQN-PixelCNN	Score	498.3	—	Unverified
Atari 2600 Gravitar	DQN-CTS	Score	238	—	Unverified
Atari 2600 Montezuma's Revenge	DQN-PixelCNN	Score	3,705.5	—	Unverified
Atari 2600 Private Eye	DQN-CTS	Score	206	—	Unverified
Atari 2600 Private Eye	DQN-PixelCNN	Score	8,358.7	—	Unverified
Atari 2600 Venture	DQN-PixelCNN	Score	82.2	—	Unverified
Atari 2600 Venture	DQN-CTS	Score	48	—	Unverified
