
Meta Reinforcement Learning for Fast Adaptation of Hierarchical Policies

2021-05-21 · NeurIPS 2021

David Kuric, Herke van Hoof


Abstract

Hierarchical methods have the potential to allow reinforcement learning to scale to larger environments. Decomposing a task into transferable components, however, remains a challenging problem. In this paper, we propose a meta-learning approach for learning such a decomposition within the options framework. We formulate the objective as a bi-level optimization problem in which sub-policies and their terminations should facilitate fast learning on a family of tasks. Once such a set of options is obtained, it can be used in new tasks where only the sequencing of options needs to be chosen. Our formalism tends to result in options where fewer decisions are needed to solve such new tasks. Experimentally, we show that our method learns transferable components that accelerate learning, and that it outperforms existing methods for this setting on the challenging ant maze locomotion task.
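The bi-level structure described in the abstract — an outer loop that meta-learns shared options (sub-policies and terminations) so that an inner loop can quickly learn only their sequencing on each new task — can be illustrated with a minimal sketch. This is not the paper's algorithm; it is a first-order toy on quadratic "tasks", and all names (`theta` for option parameters, `phi` for the per-task sequencing policy, `inner_adapt`, `meta_step`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the bi-level objective (names are illustrative, not from the paper):
#   theta -- shared option parameters (sub-policies + terminations), meta-learned
#   phi   -- per-task high-level policy that sequences the fixed options
# Each "task" is a quadratic solved when theta and phi jointly reach its target.

def task_loss(theta, phi, target):
    return np.sum((theta + phi - target) ** 2)

def inner_adapt(theta, target, steps=10, lr=0.2):
    # Inner loop: with options fixed, quickly learn only the sequencing phi.
    phi = np.zeros_like(theta)
    for _ in range(steps):
        grad_phi = 2.0 * (theta + phi - target)
        phi -= lr * grad_phi
    return phi

def meta_step(theta, targets, meta_lr=0.1):
    # Outer loop: update the options so that the loss *after* inner-loop
    # adaptation is small across the task family (first-order approximation
    # of the meta-gradient, treating the adapted phi as fixed).
    grad = np.zeros_like(theta)
    for target in targets:
        phi = inner_adapt(theta, target)
        grad += 2.0 * (theta + phi - target)
    return theta - meta_lr * grad / len(targets)

theta = rng.normal(size=3)
targets = [rng.normal(size=3) for _ in range(5)]
for _ in range(50):
    theta = meta_step(theta, targets)

adapted_losses = [task_loss(theta, inner_adapt(theta, t), t) for t in targets]
print(max(adapted_losses))
```

After meta-training, a handful of inner-loop steps suffices to drive the post-adaptation loss near zero on every task in the family, which is the "fast adaptation" property the outer objective optimizes for.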
