QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning
Srivatsan Krishnan, Maximilian Lam, Sharad Chitlangia, Zishen Wan, Gabriel Barth-Maron, Aleksandra Faust, Vijay Janapa Reddi
Abstract
Deep reinforcement learning continues to show tremendous potential in achieving task-level autonomy; however, its computational and energy demands remain prohibitively high. In this paper, we tackle this problem by applying quantization to reinforcement learning. To that end, we introduce ActorQ, a novel Reinforcement Learning (RL) training paradigm that speeds up actor-learner distributed RL training. ActorQ leverages 8-bit quantized actors to speed up data collection without affecting learning convergence. Our quantized distributed RL training system demonstrates end-to-end speedups between 1.5× and 5.41×, and faster convergence than full-precision training, across a range of tasks (DeepMind Control Suite) and RL algorithms (D4PG, DQN). Furthermore, we compare the carbon emissions (kilograms of CO2) of ActorQ against standard reinforcement learning algorithms on various tasks. Across various settings, we show that ActorQ enables more environmentally friendly reinforcement learning, reducing carbon emissions by 1.9× to 3.76× compared to training RL agents in full precision. We believe that this is the first of many future works on enabling computationally energy-efficient and sustainable reinforcement learning. The source code is publicly available at https://github.com/harvard-edge/QuaRL.
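To make the core idea concrete, below is a minimal NumPy sketch of the kind of 8-bit post-training quantization an ActorQ-style actor could apply to the learner's full-precision weights before acting. This is an illustrative example under assumed symmetric per-tensor quantization, not the authors' implementation; the function names `quantize_weights` and `dequantize_weights` are hypothetical.

```python
# Sketch: the learner trains in full precision; actors receive 8-bit
# quantized weights for fast inference during data collection.
import numpy as np

def quantize_weights(w, num_bits=8):
    """Symmetric uniform quantization of a weight tensor to int8."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    scale = np.max(np.abs(w)) / qmax        # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize_weights(q, scale):
    """Recover an approximate float32 tensor for acting."""
    return q.astype(np.float32) * scale

# Example: quantize one layer of a full-precision actor network.
rng = np.random.default_rng(0)
w_fp32 = rng.normal(size=(64, 64)).astype(np.float32)  # learner's weights
w_int8, s = quantize_weights(w_fp32)                   # shipped to actors
w_approx = dequantize_weights(w_int8, s)               # used when acting
print("max abs quantization error:", np.abs(w_fp32 - w_approx).max())
```

The design point this illustrates is that only the actors' copies of the policy are quantized; because data collection dominates wall-clock time in actor-learner RL, cheap int8 inference on the actors can speed up end-to-end training without degrading the full-precision learner's convergence.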