Representation Convergence: Mutual Distillation is Secretly a Form of Regularization
Zhengpeng Xie, Jiahang Cao, Qiang Zhang, Jianxiong Zhang, Changwei Wang, Renjing Xu
Code: github.com/myrepositories-hub/mutual-distillation-policy-optimization (official PyTorch implementation)
Abstract
In this paper, we argue that mutual distillation between reinforcement learning policies acts as a form of implicit regularization, preventing them from overfitting to irrelevant features. We highlight two key contributions: (a) theoretically, we prove for the first time that enhancing a policy's robustness to irrelevant features improves its generalization performance; (b) empirically, we demonstrate that mutual distillation between policies promotes such robustness, enabling the spontaneous emergence of invariant representations over pixel inputs. Overall, our findings challenge the conventional view of distillation as merely a means of knowledge transfer, offering a novel perspective on generalization in deep reinforcement learning.
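To make the idea of mutual distillation as a regularizer concrete, below is a minimal, hypothetical PyTorch sketch: two policies evaluate the same observations, and a symmetric KL term pulls each policy's action distribution toward the other's (detached) predictions. The names (`PolicyNet`, `mutual_distillation_loss`, the `beta` coefficient) and the exact form of the objective are illustrative assumptions, not the paper's specific method; in practice this term would be added to each policy's usual RL loss.

```python
# Minimal sketch of a mutual-distillation regularizer between two policies.
# Hypothetical and simplified: the paper's actual objective may differ
# (e.g. in how it combines this term with the RL loss or weights the KLs).
import torch
import torch.nn as nn
import torch.nn.functional as F


class PolicyNet(nn.Module):
    """Tiny MLP policy producing action logits (stand-in for a pixel encoder + head)."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def mutual_distillation_loss(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Symmetric KL between the two policies' action distributions.

    Each policy is pulled toward the other's detached predictions, which acts as a
    regularizer discouraging either policy from latching onto idiosyncratic features.
    """
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    kl_a_to_b = F.kl_div(log_p_a, log_p_b.detach().exp(), reduction="batchmean")
    kl_b_to_a = F.kl_div(log_p_b, log_p_a.detach().exp(), reduction="batchmean")
    return 0.5 * (kl_a_to_b + kl_b_to_a)


if __name__ == "__main__":
    obs = torch.randn(32, 16)  # batch of (flattened) observations
    policy_a, policy_b = PolicyNet(16, 4), PolicyNet(16, 4)
    distill = mutual_distillation_loss(policy_a(obs), policy_b(obs))
    # In training, each policy i would minimize: rl_loss_i + beta * distill
    print(f"mutual distillation term: {distill.item():.4f}")
```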