Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning

2021-09-29 · ICLR 2022

Yutong Wang, Ke Xue, Chao Qian

Abstract

Reinforcement Learning (RL), which aims to obtain a single policy that maximizes the expected cumulative reward for a given task, has achieved significant successes. However, in many real-world scenarios, e.g., navigating complex environments and controlling robots, one may need a set of policies that have both high rewards and diverse behaviors, which can bring better exploration and robust few-shot adaptation. Recently, some methods have been developed using evolutionary techniques, which iteratively reproduce and select policies. However, due to their inefficient selection mechanisms, these methods cannot fully guarantee both high quality and diversity. In this paper, we propose EDO-CS, a new Evolutionary Diversity Optimization algorithm with Clustering-based Selection. In each iteration, the policies are divided into several clusters based on their behaviors, and a high-quality policy is selected from each cluster for reproduction. EDO-CS also adaptively balances the importance of quality and diversity in the reproduction process. Experiments on various (i.e., deceptive and multi-modal) continuous control tasks show the superior performance of EDO-CS over previous methods: EDO-CS efficiently obtains a set of policies with both high quality and diversity, while previous methods cannot.
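The clustering-based selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which clustering method or behavior representation EDO-CS uses, so the sketch assumes a plain k-means over fixed-length behavior vectors and picks the highest-fitness policy in each cluster. The names `kmeans_labels` and `select_parents` are hypothetical, and the adaptive quality/diversity balancing in reproduction is not covered.

```python
import numpy as np


def kmeans_labels(points, k, iters=20, seed=0):
    """Assign each point to one of k clusters with a basic k-means loop.

    Assumption: k-means stands in for whatever clustering EDO-CS uses.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct points (fancy indexing copies).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every center: shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment against the last updated centers.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def select_parents(behaviors, fitness, k, seed=0):
    """Return the index of the highest-fitness policy in each behavior cluster."""
    fitness = np.asarray(fitness, dtype=float)
    labels = kmeans_labels(behaviors, k, seed=seed)
    parents = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue  # an empty cluster contributes no parent
        parents.append(int(idx[fitness[idx].argmax()]))
    return parents


# Toy population: four policies with 2-D behavior vectors in two distinct groups.
behaviors = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]]
fitness = [1.0, 2.0, 3.0, 4.0]
print(sorted(select_parents(behaviors, fitness, k=2)))
```

Selecting one parent per cluster keeps behaviorally distinct policies in the population even when their rewards are lower than those in the best cluster, which is the intuition the abstract gives for obtaining both quality and diversity.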

Tasks

Clustering, Continuous Control, Diversity, Reinforcement Learning (RL)