Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning

2025-06-05

Arian Raje, Baris Askin, Divyansh Jhunjhunwala, Gauri Joshi

Abstract

Large language models (LLMs) have not yet effectively leveraged the vast amounts of edge-device data, and federated learning (FL) offers a promising paradigm to collaboratively fine-tune LLMs without transferring private edge data to the cloud. To operate within the computation and communication constraints of edge devices, recent literature on federated fine-tuning of LLMs proposes the use of low-rank adaptation (LoRA) and similar parameter-efficient methods. However, LoRA-based methods suffer from accuracy degradation in FL settings, primarily because of data and computational heterogeneity across clients. We propose Ravan, an adaptive multi-head LoRA method that balances parameter efficiency and model expressivity by reparameterizing the weight updates as the sum of multiple LoRA heads s_i B_i H_i A_i, in which only the core matrices H_i and their lightweight scaling factors s_i are trained. These trainable scaling factors let the optimization focus on the most useful heads, recovering a higher-rank approximation of the full update without increasing the number of communicated parameters, since clients upload s_i H_i directly. Experiments on vision and language benchmarks show that Ravan improves test accuracy by 2-8% over prior parameter-efficient baselines, making it a robust and scalable solution for federated fine-tuning of LLMs.
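The reparameterization described above can be sketched in a few lines of NumPy. This is a minimal illustration under assumed shapes and names (it is not the authors' reference implementation): each head contributes s_i * B_i @ H_i @ A_i to the weight update, the factors B_i and A_i stay frozen, and a client only ever uploads the small cores s_i H_i.

```python
import numpy as np

# Illustrative sketch of Ravan's multi-head update (shapes and variable
# names are assumptions for this example, not the paper's exact setup).
rng = np.random.default_rng(0)
d_out, d_in, r, num_heads = 8, 6, 2, 3

# Frozen low-rank factors, fixed at initialization and never communicated.
B = [rng.standard_normal((d_out, r)) for _ in range(num_heads)]
A = [rng.standard_normal((r, d_in)) for _ in range(num_heads)]

# Trainable parameters: one small r x r core and one scaling factor per head.
H = [np.zeros((r, r)) for _ in range(num_heads)]
s = np.ones(num_heads)

def delta_w(s, H):
    """Assemble the full weight update as the sum of the scaled LoRA heads."""
    return sum(s[i] * B[i] @ H[i] @ A[i] for i in range(num_heads))

# With H_i = 0, the update is zero, so fine-tuning starts from the base model.
assert np.allclose(delta_w(s, H), np.zeros((d_out, d_in)))

# A client uploads only s_i * H_i per head: num_heads * r * r values,
# versus d_out * d_in for a full-rank update.
uploaded = sum((s[i] * H[i]).size for i in range(num_heads))
print(uploaded, d_out * d_in)  # prints: 12 48
```

Because each head has its own B_i and A_i, the summed update can reach rank up to num_heads * r while the communicated payload stays at num_heads * r * r values, which is the parameter-efficiency/expressivity trade-off the abstract refers to.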
