Mixture-of-Subspaces in Low-Rank Adaptation

2024-06-16

Taiqiang Wu, Jiahao Wang, Zhe Zhao, Ngai Wong

Code

github.com/wutaiqiang/moslora
Official, in paper, PyTorch

Abstract

In this paper, we introduce a subspace-inspired Low-Rank Adaptation (LoRA) method, which is computationally efficient, easy to implement, and readily applicable to large language, multimodal, and diffusion models. Initially, we equivalently decompose the weights of LoRA into two subspaces, and find that simply mixing them can enhance performance. To study this phenomenon, we revisit it through a fine-grained subspace lens, showing that this modification is equivalent to employing a fixed mixer to fuse the subspaces. For greater flexibility, we jointly learn the mixer with the original LoRA weights, and term the method Mixture-of-Subspaces LoRA (MoSLoRA). MoSLoRA consistently outperforms LoRA on tasks in different modalities, including commonsense reasoning, visual instruction tuning, and subject-driven text-to-image generation, demonstrating its effectiveness and robustness. Code is available at https://github.com/wutaiqiang/MoSLoRA.
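The mixer idea in the abstract can be illustrated with a small, dependency-free sketch. It assumes the low-rank update is written as A times B and that the mixer is an r-by-r matrix placed between the two factors. The shape convention, the function names, and the toy values are all assumptions made for illustration, not taken from the official repository. Under that assumption, an identity mixer reproduces the plain LoRA update, while a mixer with off-diagonal entries fuses the rank-one subspaces.

```python
# Illustrative sketch only: names, shapes, and values are hypothetical,
# not the authors' implementation.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def lora_update(A, B):
    """Plain low-rank update: A is (d_in x r), B is (r x d_out)."""
    return matmul(A, B)

def mixed_update(A, mixer, B):
    """Low-rank update with an (r x r) mixer inserted between A and B."""
    return matmul(matmul(A, mixer), B)

# Toy example with d_in = 3, r = 2, d_out = 2.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B = [[2.0, 0.0], [0.0, 3.0]]

# A fixed identity mixer leaves the update unchanged.
delta_lora = lora_update(A, B)
delta_identity = mixed_update(A, identity(2), B)

# Off-diagonal entries let each column of A interact with every row of B.
mixer = [[1.0, 0.5], [0.5, 1.0]]
delta_mixed = mixed_update(A, mixer, B)
```

In training, the mixer would be a learnable parameter optimized jointly with A and B, as the abstract describes. The sketch only shows how the three factors compose. With this layout the mixer adds r × r values on top of the two low-rank factors.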

Tasks

Common Sense Reasoning, Image Generation, Question Answering, Sentence Completion, Text to Image Generation, Text-to-Image Generation, Visual Question Answering

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
arc_challenge	LLaMA 3 8B + MoSLoRA (fine-tuned)	Accuracy	81.5	—	Unverified
arc_easy	LLaMA 3 8B + MoSLoRA (fine-tuned)	Accuracy	90.5	—	Unverified
WinoGrande	LLaMA 3 8B + MoSLoRA	Accuracy	85.8	—	Unverified
