
Scalable Evolution Strategies Pipeline for Solving the Vehicle Routing Problem

2020-10-17 · NeurIPS Workshop LMCA 2020

Anonymous


Abstract

As a general framework for applying deep learning to problem solving, Deep Reinforcement Learning (Deep RL) has many applications. In this paper we study Deep RL as it applies to the Vehicle Routing Problem (VRP). Specifically, we focus on the capacitated variant of the VRP (CVRP), in which vehicles have a maximum carrying capacity and customers have varying demands. Several papers in the literature have already applied Deep RL to the CVRP. While these methods produce solutions fairly quickly, so far they all train their models on GPUs, which reduces scalability. Recently, OpenAI released a study comparing Evolution Strategies (ES) with classic Deep RL training methods, such as Policy Gradient (PG), and found that ES uses fewer resources and performs similarly to state-of-the-art Deep RL training methods. The main benefit is that ES training can run on CPUs in parallel, which costs less than training on a GPU. In light of this, we are motivated to replace the traditional RL training methods used in this line of research with ES and compare the two.
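To make the ES idea in the abstract concrete, the following is a minimal sketch of an ES parameter update on a toy objective. It is not the paper's pipeline: the function names (`evolution_strategies`, `toy_reward`), the target vector, and all hyperparameter values are hypothetical, and the toy reward stands in for the negative route cost a CVRP policy would receive. Subtracting the mean reward as a baseline is a simplification chosen here. Each perturbed evaluation is independent of the others, which is what allows this kind of training to be spread across CPU workers; the sketch evaluates them sequentially for brevity.

```python
import numpy as np

# Hypothetical optimum of the toy problem (illustration only).
TARGET = np.array([0.5, -0.3, 0.8])


def toy_reward(theta):
    """Toy stand-in for a policy's reward: higher is better, maximal at TARGET."""
    return -np.sum((theta - TARGET) ** 2)


def evolution_strategies(objective, theta, sigma=0.1, alpha=0.05,
                         population=50, iterations=300, seed=0):
    """Maximize `objective` with a basic ES update.

    All defaults are assumptions chosen for this toy problem.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float).copy()
    for _ in range(iterations):
        # Sample one Gaussian perturbation per population member.
        noise = rng.standard_normal((population, theta.size))
        # Evaluate each perturbed parameter vector. These calls are
        # independent, so they could be farmed out to CPU workers.
        rewards = np.array([objective(theta + sigma * eps) for eps in noise])
        # Mean-centre the rewards (a simple baseline, assumed here).
        centred = rewards - rewards.mean()
        # Reward-weighted sum of perturbations estimates the gradient.
        gradient_estimate = noise.T @ centred / (population * sigma)
        theta = theta + alpha * gradient_estimate
    return theta


if __name__ == "__main__":
    solution = evolution_strategies(toy_reward, np.zeros(3))
    print("estimated parameters:", solution)
    print("distance to target:", np.linalg.norm(solution - TARGET))
```

Note that the update needs only reward values, not backpropagated gradients, so each worker only has to run forward evaluations of the perturbed parameters.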
