Composing Complex Skills by Learning Transition Policies with Proximity Reward Induction

2019-05-01 · ICLR 2019

Youngwoon Lee*, Shao-Hua Sun*, Sriram Somasundaram, Edward Hu, Joseph J. Lim


Abstract

Intelligent creatures acquire complex skills by exploiting previously learned skills and learning to transition between them. To empower machines with this ability, we propose transition policies which effectively connect primitive skills to perform sequential tasks without handcrafted rewards. To effectively train our transition policies, we introduce proximity predictors which induce rewards gauging proximity to suitable initial states for the next skill. The proposed method is evaluated on a diverse set of continuous control experiments in MuJoCo, covering both bipedal locomotion and robotic arm manipulation tasks. We demonstrate that transition policies enable us to effectively learn complex tasks and that the induced proximity reward, computed using the proximity predictor, improves training efficiency. Videos of policies learned by our algorithm and baselines can be found at https://sites.google.com/view/transitions-iclr2019.
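The proximity-induced reward described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the exponential labeling scheme, the discount `delta`, the function names, and the hand-written `toy_predictor` are all assumptions made for illustration. The sketch labels states of a successful trajectory by how close they are to its end, and rewards a transition by the increase in predicted proximity.

```python
import math
from typing import Callable, List, Sequence

State = Sequence[float]


def proximity_labels(num_steps: int, delta: float = 0.95) -> List[float]:
    """Assumed labeling: delta ** (steps remaining) for each state of a
    trajectory that ends in a suitable initial state for the next skill."""
    return [delta ** (num_steps - 1 - t) for t in range(num_steps)]


def induced_reward(predictor: Callable[[State], float],
                   state: State, next_state: State) -> float:
    """Reward a transition by the change in predicted proximity."""
    return predictor(next_state) - predictor(state)


def toy_predictor(state: State) -> float:
    """Hypothetical stand-in for a learned predictor: proximity decays
    with Euclidean distance to a made-up target state."""
    target = (1.0, 0.0)
    dist = math.sqrt(sum((s - g) ** 2 for s, g in zip(state, target)))
    return math.exp(-dist)


if __name__ == "__main__":
    print(proximity_labels(3, delta=0.5))
    print(induced_reward(toy_predictor, (0.0, 0.0), (0.5, 0.0)))
```

Under these assumptions, a step toward the target state yields a positive reward and a step away yields a negative one, which gives the transition policy a dense signal in place of a sparse success indicator. In practice the predictor would be a learned function rather than a fixed distance formula.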

Tasks

Continuous Control, MuJoCo
