Deep Mutual Learning
Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu
Code
- github.com/huanghoujing/AlignedReID-Re-Production-Pytorch (PyTorch)
- github.com/aquvitae/aquvitae (TensorFlow)
- github.com/shubhamtyagii/Aligned_Reid (PyTorch)
- github.com/pilsHan/DML_for-personal-study (PyTorch)
- github.com/pilsHan/DML (PyTorch)
- github.com/h4veFunCodin9/Aligned_ReID (PyTorch)
- github.com/abhishirk/Aligned_ReId (PyTorch)
Abstract
Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network. The typical application is to transfer from a powerful large network or ensemble to a small network that is better suited to low-memory or fast-execution requirements. In this paper, we present a deep mutual learning (DML) strategy where, rather than one-way transfer between a static pre-defined teacher and a student, an ensemble of students learns collaboratively, with the students teaching each other throughout the training process. Our experiments show that a variety of network architectures benefit from mutual learning and achieve compelling results on the CIFAR-100 recognition and Market-1501 person re-identification benchmarks. Surprisingly, no prior powerful teacher network is necessary: mutual learning of a collection of simple student networks works, and moreover outperforms distillation from a more powerful yet static teacher.
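
The abstract does not state the training objective, so the following is only a minimal sketch of how two students might teach each other. It assumes each student's loss is its supervised cross-entropy plus a KL-divergence term that pulls its predictions toward its peer's. The function names, the two-student setup, and the unweighted sum are illustrative assumptions and do not come from the text above.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Supervised loss: negative log-probability of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q): how far q is from the target distribution p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_losses(logits_a, logits_b, label):
    """Per-student losses for one example (assumed form, see lead-in).

    Each student is penalised for misclassifying the example and for
    disagreeing with its peer. In a real training loop the peer's
    probabilities would be treated as a fixed target when updating
    each student.
    """
    p_a = softmax(logits_a)
    p_b = softmax(logits_b)
    loss_a = cross_entropy(p_a, label) + kl_divergence(p_b, p_a)
    loss_b = cross_entropy(p_b, label) + kl_divergence(p_a, p_b)
    return loss_a, loss_b


if __name__ == "__main__":
    # Hypothetical logits from two students on one 3-class example.
    loss_a, loss_b = mutual_losses([2.0, 0.5, -1.0], [0.1, 1.5, 0.3], label=0)
    print(f"student A loss: {loss_a:.4f}, student B loss: {loss_b:.4f}")
```

When both students produce identical predictions the KL terms vanish and each loss reduces to plain cross-entropy. Under this assumed form, the peer term therefore only contributes while the students disagree.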