On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists

2024-09-20

Dongyang Fan, Bettina Messmer, Martin Jaggi

Code

github.com/epfml/comigs
Official · In paper · PyTorch

Abstract

On-device LLMs have gained increasing attention for their ability to enhance privacy and provide a personalized user experience. To facilitate learning with private and scarce local data, federated learning has become a standard approach, though it introduces challenges related to system and data heterogeneity among end users. As a solution, we propose a novel Collaborative learning approach with a Mixture of Generalists and Specialists (CoMiGS), the first to effectively address both forms of heterogeneity. Our approach distinguishes generalists from specialists by aggregating certain experts across end users while keeping others localized so that they specialize in user-specific datasets. A key innovation of our method is the bi-level optimization formulation of the Mixture-of-Experts learning objective, in which the router is updated using a separate validation set that represents the target distribution. CoMiGS effectively balances collaboration and personalization, as demonstrated by its superior performance in scenarios with high data heterogeneity across multiple datasets. By design, our approach accommodates users' varying computational resources by allowing different numbers of specialists per user. By decoupling resource abundance from data quantity, CoMiGS remains robust against overfitting, owing to the generalists' regularizing effect, while adapting to local data through specialist expertise.
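The mechanics described in the abstract can be illustrated with a deliberately tiny sketch. It is not the implementation in the official repository: experts are reduced to single scalars, the loss is a mean squared error, and every name (`expert_step`, `router_step`, `aggregate_generalists`, the dictionary layout) is hypothetical. It only aims to show three ideas: experts are trained on the local training split, the router is trained on a separate validation split, and only the generalist is averaged across users, who may hold different numbers of specialists.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(user):
    """Router-weighted mix of the generalist and the user's local specialists."""
    experts = [user["generalist"]] + user["specialists"]
    weights = softmax(user["router"])
    pred = sum(w * e for w, e in zip(weights, experts))
    return pred, weights, experts

def expert_step(user, train_targets, lr=0.1):
    """Lower level (toy): one gradient step on the experts using the training split."""
    pred, weights, experts = predict(user)
    err = 2.0 * (pred - sum(train_targets) / len(train_targets))  # dLoss/dpred
    user["generalist"] = experts[0] - lr * err * weights[0]
    user["specialists"] = [e - lr * err * w
                           for e, w in zip(experts[1:], weights[1:])]

def router_step(user, val_targets, lr=0.1):
    """Upper level (toy): one gradient step on the router using the validation split."""
    pred, weights, experts = predict(user)
    err = 2.0 * (pred - sum(val_targets) / len(val_targets))
    # d pred / d logit_j = w_j * (e_j - pred) for a softmax-weighted sum
    user["router"] = [r - lr * err * w * (e - pred)
                      for r, w, e in zip(user["router"], weights, experts)]

def aggregate_generalists(users):
    """Average only the generalist across users; specialists and routers stay local."""
    mean = sum(u["generalist"] for u in users) / len(users)
    for u in users:
        u["generalist"] = mean

# Two users with different compute budgets: one specialist versus three.
users = [
    {"generalist": 0.0, "specialists": [3.0], "router": [0.0, 0.0]},
    {"generalist": 2.0, "specialists": [3.0, 4.0, 5.0], "router": [0.0] * 4},
]
train = [[2.0, 4.0], [4.0, 6.0]]   # made-up local training targets
val = [[2.0], [5.0]]               # made-up local validation targets

for _ in range(5):                 # communication rounds
    for u, tr, va in zip(users, train, val):
        expert_step(u, tr)
        router_step(u, va)
    aggregate_generalists(users)
```

In this sketch the router length is always one plus the number of specialists, which is how a user with fewer resources can participate with a smaller model while still sharing the generalist.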

Tasks

Federated Learning, Language Modeling, Mixture-of-Experts
