G^2D: Boosting Multimodal Learning with Gradient-Guided Distillation
Mohammed Rakib, Arunkumar Bagavathi
Code Available — Be the first to reproduce this paper.
ReproduceCode
- github.com/raison-lab/g2dOfficialIn paperpytorch★ 3
Abstract
Multimodal learning aims to leverage information from diverse data modalities to achieve more comprehensive performance. However, conventional multimodal models often suffer from modality imbalance, where one or a few modalities dominate model optimization, leading to suboptimal feature representation and underutilization of weak modalities. To address this challenge, we introduce Gradient-Guided Distillation (G^2D), a knowledge distillation framework that optimizes the multimodal model with a custom-built loss function that fuses both unimodal and multimodal objectives. G^2D further incorporates a dynamic sequential modality prioritization (SMP) technique in the learning process to ensure each modality leads the learning process, avoiding the pitfall of stronger modalities overshadowing weaker ones. We validate G^2D on multiple real-world datasets and show that G^2D amplifies the significance of weak modalities while training and outperforms state-of-the-art methods in classification and regression tasks. Our code is available at https://github.com/rAIson-Lab/G2D.