Will it Blend? Composing Value Functions in Reinforcement Learning

2018-07-12

Benjamin van Niekerk, Steven James, Adam Earle, Benjamin Rosman

Abstract

An important property for lifelong-learning agents is the ability to combine existing skills to solve unseen tasks. In general, however, it is unclear how to compose skills in a principled way. We provide a "recipe" for optimal value function composition in entropy-regularised reinforcement learning (RL) and then extend this to the standard RL setting. Composition is demonstrated in a video game environment, where an agent with an existing library of policies is able to solve new tasks without the need for further learning.
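
The abstract does not spell out the composition rule, so the following is only an illustrative sketch under stated assumptions. It assumes that, for a new task solvable by completing any one of several existing tasks, the entropy-regularised composition is a weighted log-sum-exp of the per-task Q-values at temperature `tau`, and that the standard-RL counterpart is an elementwise maximum. The function names, the `weights` argument, and the toy Q-values are hypothetical and not taken from the paper.

```python
import numpy as np


def compose_soft(q_values, weights=None, tau=1.0):
    """Weighted log-sum-exp over per-task Q-values (assumed composition rule).

    q_values: array of shape (n_tasks, ...), one Q-table or Q-vector per task.
    weights:  optional non-negative task weights; normalised to sum to 1.
    tau:      temperature of the entropy regularisation (assumed parameter).
    """
    q = np.asarray(q_values, dtype=float)
    n_tasks = q.shape[0]
    if weights is None:
        w = np.full(n_tasks, 1.0 / n_tasks)
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
    scaled = q / tau
    # Subtract the per-entry maximum before exponentiating for numerical stability.
    m = scaled.max(axis=0)
    weighted_sum = np.tensordot(w, np.exp(scaled - m), axes=1)
    return tau * (m + np.log(weighted_sum))


def compose_max(q_values):
    """Elementwise maximum over tasks (assumed standard-RL counterpart)."""
    return np.asarray(q_values, dtype=float).max(axis=0)


if __name__ == "__main__":
    # Two hypothetical tasks, two actions in a single state.
    q = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    print("soft, tau=1.0 :", compose_soft(q, tau=1.0))
    print("soft, tau=0.01:", compose_soft(q, tau=0.01))
    print("max           :", compose_max(q))
```

Under these assumptions, lowering `tau` moves the soft composition towards the elementwise maximum, which mirrors the abstract's step from the entropy-regularised setting to standard RL. No further learning is involved: the composed values are computed directly from the stored per-task values.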