A Distributional View on Multi-Objective Policy Optimization

2020-05-15

Abbas Abdolmaleki, Sandy H. Huang, Leonard Hasenclever, Michael Neunert, H. Francis Song, Martina Zambelli, Murilo F. Martins, Nicolas Heess, Raia Hadsell, Martin Riedmiller

Abstract

Many real-world problems require trading off multiple competing objectives. However, these objectives are often in different units and/or scales, which can make it challenging for practitioners to express numerical preferences over objectives in their native units. In this paper we propose a novel algorithm for multi-objective reinforcement learning that enables setting desired preferences for objectives in a scale-invariant way. We propose to learn an action distribution for each objective, and we use supervised learning to fit a parametric policy to a combination of these distributions. We demonstrate the effectiveness of our approach on challenging high-dimensional real and simulated robotics tasks, and show that setting different preferences in our framework allows us to trace out the space of nondominated solutions.
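The two steps named in the abstract (one action distribution per objective, then a supervised fit to their combination) can be illustrated with a toy, single-state example over a discrete action set. This is only a sketch under stated assumptions, not the paper's implementation. The abstract does not say how preferences are encoded; the sketch assumes each preference is a KL budget `epsilon` limiting how far that objective's distribution may move from the old policy, which makes the result independent of the objective's units. All function names (`improved_distribution`, `solve_temperature`, `fit_policy`) and all numbers are hypothetical.

```python
import numpy as np

def improved_distribution(pi_old, q_values, eta):
    """q(a) proportional to pi_old(a) * exp(Q(a) / eta), computed stably."""
    logits = np.log(pi_old) + (q_values - q_values.max()) / eta
    logits -= logits.max()
    weights = np.exp(logits)
    return weights / weights.sum()

def kl(q, p):
    """KL(q || p) for categorical distributions, skipping zero-mass entries of q."""
    mask = q > 0
    return float(np.sum(q[mask] * (np.log(q[mask]) - np.log(p[mask]))))

def solve_temperature(pi_old, q_values, epsilon, iters=100):
    """Bisection in log space for eta with KL(q || pi_old) close to epsilon.

    Assumes the Q-values are not all equal and that epsilon is reachable.
    """
    spread = q_values.max() - q_values.min()
    lo, hi = np.log(spread * 1e-4), np.log(spread * 1e4)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        q = improved_distribution(pi_old, q_values, np.exp(mid))
        if kl(q, pi_old) > epsilon:
            lo = mid  # distribution moved too far: raise the temperature
        else:
            hi = mid  # within budget: try a lower temperature
    return float(np.exp(0.5 * (lo + hi)))

def fit_policy(target_dists):
    """Supervised (maximum-likelihood) fit of a categorical policy.

    Maximizing sum_k E_{q_k}[log pi(a)] over categorical pi has the
    closed-form solution pi = normalized average of the q_k.
    """
    avg = np.mean(target_dists, axis=0)
    return avg / avg.sum()

# Hypothetical data: two objectives with very different scales.
pi_old = np.full(4, 0.25)
q_task = np.array([1.0, 2.0, 3.0, 4.0])
q_cost = np.array([-300.0, -100.0, -200.0, -400.0])
epsilons = [0.1, 0.1]  # assumed per-objective preferences (KL budgets)

targets = [
    improved_distribution(pi_old, q, solve_temperature(pi_old, q, eps))
    for q, eps in zip([q_task, q_cost], epsilons)
]
pi_new = fit_policy(targets)
```

Because the temperature is solved for rather than fixed, multiplying `q_cost` by any positive constant leaves its target distribution unchanged in this sketch; only the `epsilons` control how strongly each objective pulls on the fitted policy. A neural-network policy would replace the closed-form average in `fit_policy` with gradient steps on the same weighted log-likelihood.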

Tasks

Multi-Objective Reinforcement Learning, Reinforcement Learning (RL)
