Intrinsically Guided Exploration in Meta Reinforcement Learning

2021-01-01

Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang

Abstract

Deep reinforcement learning algorithms generally require large amounts of data to solve a single task. Meta reinforcement learning (meta-RL) agents learn to adapt to novel unseen tasks with high sample efficiency by extracting useful prior knowledge from previous tasks. Despite recent progress, efficient exploration in meta-training and adaptation remains a key challenge in sparse-reward meta-RL tasks. We propose a novel off-policy meta-RL algorithm to address this problem, which disentangles exploration and exploitation policies and learns intrinsically motivated exploration behaviors. We design novel intrinsic rewards derived from information gain to reduce task uncertainty and encourage the explorer to collect informative trajectories about the current task. Experimental evaluation shows that our algorithm achieves state-of-the-art performance on various sparse-reward MuJoCo locomotion tasks and more complex Meta-World tasks.
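The abstract gives no formula for the information-gain reward, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes a discrete belief over a small set of candidate tasks and scores a transition by how much it reduces the entropy of that belief. The function names (`entropy`, `bayes_update`, `info_gain_reward`) and the likelihood values are hypothetical and chosen for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def bayes_update(prior, likelihoods):
    """Posterior over tasks given per-task likelihoods of one observed transition."""
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnorm)
    if z == 0.0:
        # Observation is impossible under every task: keep the prior (assumption).
        return list(prior)
    return [u / z for u in unnorm]

def info_gain_reward(prior, likelihoods):
    """Illustrative intrinsic reward: reduction in task-belief entropy.

    The realized reduction can be negative for a single observation;
    only its expectation over observations is guaranteed non-negative.
    """
    posterior = bayes_update(prior, likelihoods)
    return entropy(prior) - entropy(posterior), posterior

# Uniform belief over four hypothetical tasks.
prior = [0.25, 0.25, 0.25, 0.25]

# A transition that is much more likely under task 0 than under the others.
informative = [0.9, 0.1, 0.1, 0.1]
# A transition that is equally likely under every task.
uninformative = [0.5, 0.5, 0.5, 0.5]

r_info, post = info_gain_reward(prior, informative)
r_none, _ = info_gain_reward(prior, uninformative)
```

Under these assumptions, a transition that discriminates between tasks earns a positive reward, while one that is equally likely under every task earns none. An exploration policy trained on such a signal would be pushed toward the parts of a sparse-reward environment that reveal task identity.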

Tasks

Deep Reinforcement Learning, Efficient Exploration, Meta Reinforcement Learning, MuJoCo, reinforcement-learning, Reinforcement Learning, Reinforcement Learning (RL)
