SOTAVerified

MoPE: Mixture of Prompt Experts for Parameter-Efficient and Scalable Multimodal Fusion

2024-03-14Code Available1· sign in to hype

Ruixiang Jiang, Lingbo Liu, Changwen Chen

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

Despite the demonstrated parameter efficiency of prompt-based multimodal fusion methods, their limited adaptivity and expressiveness often result in suboptimal performance compared to other tuning approaches. In this paper, we introduce the Mixture of Prompt Experts (MoPE), the first technique designed to overcome these limitations by decomposing standard prompts to capture instance-level features adaptively. Building on this decomposition, MoPE enhances prompt fusion's expressiveness by leveraging multimodal pairing priors to route the most effective prompt for each instance dynamically. Compared to vanilla prompting, our MoPE-based fusion method exhibits greater expressiveness, scaling more effectively with the training data and the overall number of trainable parameters. We also investigate regularization terms for expert routing, which lead to emergent expert specialization with enhanced adaptiveness and interpretablity. Extensive experiments across six multimodal datasets spanning four modalities demonstrate state-of-the-art performance for prompt fusion, matching or even surpassing the performance of fine-tuning while requiring only 0.8% of the trainable parameters. Project homepage: https://github.com/songrise/MoPE

Tasks

Reproductions