
Learning to Control Visual Abstractions for Structured Exploration in Deep Reinforcement Learning

2019-05-01 · ICLR 2019

Catalin Ionescu, Tejas Kulkarni, Aaron van den Oord, Andriy Mnih, Vlad Mnih


Abstract

Exploration in environments with sparse rewards is a key challenge for reinforcement learning. How do we design agents with generic inductive biases so that they can explore in a consistent manner, instead of relying on local exploration schemes like epsilon-greedy? We propose an unsupervised reinforcement learning agent that learns a discrete pixel grouping model which preserves the spatial geometry of the sensors and, implicitly, of the environment as well. We use this representation to derive geometric intrinsic reward functions, such as centroid coordinates and area, and learn policies to control each of them with off-policy learning. These policies form a basis set of behaviors (options) which allows us to explore in a consistent way and can be used in a hierarchical reinforcement learning setup to solve for extrinsically defined rewards. We show that our approach scales to a variety of domains with competitive performance, including navigation in 3D environments and Atari games with sparse rewards.
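The geometric intrinsic rewards mentioned above (centroid coordinates and area) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the learned grouping model has already produced a per-pixel integer mask, and the function name and interface are hypothetical.

```python
# Hedged sketch: computing geometric features (centroid, area) per
# pixel group from a discrete segmentation mask. The grouping model
# itself is learned in the paper; here we just assume its output is
# a 2-D integer array assigning each pixel to a group.
import numpy as np

def mask_features(mask):
    """Return {group_id: (centroid_row, centroid_col, area)} for a
    2-D integer segmentation mask (hypothetical interface)."""
    features = {}
    for gid in np.unique(mask):
        rows, cols = np.nonzero(mask == gid)
        area = rows.size                      # number of pixels in group
        features[int(gid)] = (rows.mean(), cols.mean(), area)
    return features

# Toy 4x4 mask with two groups (0 = background, 1 = object)
mask = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])
feats = mask_features(mask)
# Group 1 occupies rows 0-1, cols 2-3 -> centroid (0.5, 2.5), area 4
```

An intrinsic reward for an option could then be defined, for example, as progress in moving a group's centroid toward a target coordinate or growing its area; the specific reward shaping is an assumption, not taken from the paper.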
