Gradient Surgery for Multi-Task Learning

2020-01-19 · NeurIPS 2020 · Code Available

Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

Code

github.com/tianheyu927/PCGrad (official, in paper; tf, ★ 352)
github.com/PaddlePaddle/PaddleScience (paddle, ★ 437)
github.com/WeiChengTseng/Pytorch-PCGrad (pytorch, ★ 396)
github.com/torchjd/torchjd (pytorch, ★ 307)
github.com/avivnavon/nash-mtl (pytorch, ★ 239)
github.com/cranial-xix/famo (pytorch, ★ 120)
github.com/wgchang/PCGrad-pytorch-example (pytorch, ★ 31)
github.com/grtzsohalf/SpeechNet-codebase (pytorch, ★ 21)
github.com/OrthoDex/PCGrad-PyTorch (pytorch, ★ 13)
github.com/rangwani-harsh/PC_Grad_Pytorch (pytorch, ★ 5)

Abstract

While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge. Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks to enable more efficient learning. However, the multi-task setting presents a number of optimization challenges, making it difficult to realize large efficiency gains compared to learning tasks independently. The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood. In this work, we identify a set of three conditions of the multi-task optimization landscape that cause detrimental gradient interference, and develop a simple yet general approach for avoiding such interference between task gradients. We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient. On a series of challenging multi-task supervised and multi-task RL problems, this approach leads to substantial gains in efficiency and performance. Further, it is model-agnostic and can be combined with previously-proposed multi-task architectures for enhanced performance.
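
The projection step described in the abstract can be illustrated with a minimal NumPy sketch. It rests on stated assumptions rather than on the official code: each task gradient is a flattened 1-D vector, two gradients count as "conflicting" when their dot product is negative, and the projected per-task gradients are summed into one update. The function names `project_conflicting` and `pcgrad_combine` are hypothetical and do not come from any of the repositories listed above.

```python
import numpy as np


def project_conflicting(g_i, g_j):
    """Project g_i onto the normal plane of g_j, but only if they conflict.

    Assumption: "conflict" means a negative dot product between the two
    gradients. Non-conflicting gradients are returned unchanged (as a copy).
    """
    dot = float(np.dot(g_i, g_j))
    if dot >= 0.0:
        return g_i.copy()
    # Remove the component of g_i that points along g_j.
    return g_i - (dot / float(np.dot(g_j, g_j))) * g_j


def pcgrad_combine(grads, rng=None):
    """Combine per-task gradients after projecting away pairwise conflicts.

    grads: list of 1-D arrays, one flattened gradient per task.
    rng:   optional numpy Generator used to visit the other tasks in random order.
    """
    rng = np.random.default_rng() if rng is None else rng
    grads = [np.asarray(g, dtype=float) for g in grads]
    projected = []
    for i, g in enumerate(grads):
        g_pc = g.copy()
        others = [j for j in range(len(grads)) if j != i]
        # Each projection is taken against the original gradient of task j.
        for j in rng.permutation(others):
            g_pc = project_conflicting(g_pc, grads[j])
        projected.append(g_pc)
    # Assumption: the final update direction is the sum of projected gradients.
    return np.sum(projected, axis=0)


if __name__ == "__main__":
    g1 = np.array([1.0, 0.0])
    g2 = np.array([-1.0, 1.0])  # dot(g1, g2) = -1, so the two tasks conflict
    print(pcgrad_combine([g1, g2]))
```

In this two-task example, `g1` is projected to a vector orthogonal to `g2` and vice versa, so neither projected gradient has a negative component along the other task's gradient. In a real training loop the per-task gradients would come from the framework's autodiff, flattened across all shared parameters.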

Tasks

Deep Reinforcement Learning, Image Classification, Multi-Task Learning, Reinforcement Learning (RL)
