
Unity is Power: Semi-Asynchronous Collaborative Training of Large-Scale Models with Structured Pruning in Resource-Limited Clients

2024-10-11

Yan Li, Mingyi Li, Xiao Zhang, Guangwei Xu, Feng Chen, Yuan Yuan, Yifei Zou, Mengying Zhao, Jianbo Lu, Dongxiao Yu

Abstract

In this work, we study how to unleash the potential of massive heterogeneous weak computing power to collaboratively train large-scale models on dispersed datasets. To improve both efficiency and accuracy in resource-adaptive collaborative learning, we take the first step toward simultaneously addressing the challenges of unstructured pruning, varying submodel architectures, knowledge loss, and stragglers. We propose a novel semi-asynchronous collaborative training framework, namely Co-S^2P, with data distribution-aware structured pruning and a cross-block knowledge transfer mechanism to address these concerns. Furthermore, we provide a theoretical proof that Co-S^2P achieves an asymptotically optimal convergence rate of $O(1/\sqrt{N^*EQ})$. Finally, we conduct extensive experiments on a real-world hardware testbed, in which 16 heterogeneous Jetson devices are united to train large-scale models with up to 0.11 billion parameters. The experimental results demonstrate that Co-S^2P improves accuracy by up to 8.8% and resource utilization by up to 1.2× compared to state-of-the-art methods, while reducing memory consumption by approximately 22% and training time by about 24% on all resource-limited devices.
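To make the idea of capacity-matched submodels concrete, the sketch below shows, purely as an illustration and not the authors' Co-S^2P algorithm, how structured (row-level) pruning could carve a submodel out of a full weight matrix for a weak client, and how a server might merge submodel updates back into the full model in a semi-asynchronous round. The function names, the norm-based keep criterion, and the simple averaging rule are assumptions made for this example; the paper's data distribution-aware pruning and cross-block knowledge transfer are not reproduced here.

```python
import numpy as np


def structured_prune(weight: np.ndarray, keep_ratio: float):
    """Keep the output rows (whole neurons) with the largest L2 norm.

    Returns the pruned weight and the indices of the kept rows, so the
    server can map the submodel back into the full model. The norm-based
    criterion is an illustrative stand-in, not the paper's method.
    """
    n_keep = max(1, int(round(weight.shape[0] * keep_ratio)))
    row_norms = np.linalg.norm(weight, axis=1)
    kept = np.sort(np.argsort(row_norms)[-n_keep:])
    return weight[kept], kept


def aggregate(full_shape, updates):
    """Average submodel weights back into the full weight matrix.

    `updates` is a list of (pruned_weight, kept_indices) pairs from the
    clients that have reported within the current semi-asynchronous round;
    rows that no client trained keep a zero contribution.
    """
    acc = np.zeros(full_shape)
    counts = np.zeros(full_shape[0])
    for sub_w, kept in updates:
        acc[kept] += sub_w
        counts[kept] += 1
    counts = np.maximum(counts, 1)  # avoid division by zero for untouched rows
    return acc / counts[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    full_w = rng.normal(size=(8, 4))
    # Two clients with different capacities train differently sized submodels.
    updates = [structured_prune(full_w, 0.5), structured_prune(full_w, 0.75)]
    print(aggregate(full_w.shape, updates).shape)  # (8, 4)
```

Because whole rows are removed rather than individual entries, each client's submodel stays a dense, smaller matrix that runs efficiently on limited hardware, which is the practical motivation for structured over unstructured pruning in this setting.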
