
Efficient Training of Large-Scale AI Models Through Federated Mixture-of-Experts: A System-Level Approach

2025-07-08

Xiaobing Chen, Boyang Zhang, Xiangwei Zhou, Mingxuan Sun, Shuai Zhang, Songyang Zhang, Geoffrey Ye Li


Abstract

The integration of Federated Learning (FL) and Mixture-of-Experts (MoE) presents a compelling pathway for training more powerful, large-scale artificial intelligence models (LAMs) on decentralized data while preserving privacy. However, efficient federated training of these complex MoE-structured LAMs is hindered by significant system-level challenges, particularly in managing the interplay between heterogeneous client resources and the sophisticated coordination required for numerous specialized experts. This article highlights a critical yet underexplored gap: the absence of robust quantitative strategies for dynamic client-expert alignment that holistically account for varying client capacities and the need for system-wide load balancing. Specifically, we propose a conceptual system design for intelligent client-expert alignment that incorporates dynamic fitness scoring, global expert load monitoring, and client capacity profiling. By tackling these systemic issues, we can unlock more scalable, efficient, and robust training mechanisms that converge in fewer communication rounds, paving the way for the widespread deployment of large-scale federated MoE-structured LAMs in edge computing with ultra-high communication efficiency.
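To make the three ingredients named in the abstract concrete, the following minimal sketch (our own illustration, not the authors' algorithm) shows one way dynamic fitness scoring, global expert load monitoring, and client capacity profiling could interact in a greedy client-expert assignment loop. All function names, weighting coefficients, and the serving order are assumptions introduced for illustration only.

```python
import numpy as np

# Hypothetical sketch of client-expert alignment. The fitness score combines
# (i) client-expert data affinity, (ii) a per-client capacity profile, and
# (iii) a global expert-load penalty that discourages overloading hot experts.
# Weights alpha/beta/gamma are illustrative assumptions, not values from the paper.

def fitness_scores(affinity, capacity, expert_load, alpha=1.0, beta=0.5, gamma=0.5):
    """Return a [num_clients, num_experts] fitness matrix.

    affinity    : data/gradient affinity of each client to each expert
    capacity    : per-client compute/communication budget, normalized to [0, 1]
    expert_load : running count of clients currently assigned to each expert
    """
    load_penalty = expert_load / (expert_load.sum() + 1e-8)      # global load monitor
    return (alpha * affinity
            + beta * capacity[:, None]                           # capacity profiling
            - gamma * load_penalty[None, :])                     # penalize hot experts

def align_clients_to_experts(affinity, capacity, experts_per_client=2):
    """Greedy round-based assignment: each client takes its top-k experts by
    fitness, and the expert-load counter is updated so later clients steer
    away from already-overloaded experts."""
    num_clients, num_experts = affinity.shape
    expert_load = np.zeros(num_experts)
    assignment = {}
    # Serve lower-capacity clients first so they still reach lightly loaded experts.
    for c in np.argsort(capacity):
        scores = fitness_scores(affinity, capacity, expert_load)[c]
        chosen = np.argsort(scores)[::-1][:experts_per_client]
        assignment[int(c)] = chosen.tolist()
        expert_load[chosen] += 1
    return assignment

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    affinity = rng.random((8, 4))   # 8 clients, 4 experts
    capacity = rng.random(8)        # heterogeneous client budgets
    print(align_clients_to_experts(affinity, capacity))
```

In this toy setup the load-penalty term spreads assignments across experts, which is the kind of system-wide balancing the abstract argues is missing from current federated MoE training pipelines; a full design would also fold in communication cost and per-round participation constraints.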
