NEFTune: Noisy Embeddings Improve Instruction Finetuning
Neel Jain, Ping-Yeh Chiang, Yuxin Wen, John Kirchenbauer, Hong-Min Chu, Gowthami Somepalli, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
Code
- github.com/neelsjain/neftune (official, PyTorch, ★ 412)
- github.com/openaccess-ai-collective/axolotl (PyTorch, ★ 11,497)
- github.com/rijgersberg/geitje (PyTorch, ★ 129)
- github.com/akjindal53244/arithmo (PyTorch, ★ 73)
Abstract
We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on AlpacaEval, which rises to 64.69% using noisy embeddings. NEFTune also improves over strong baselines on modern instruction datasets. Models trained with Evol-Instruct see a 10% improvement, with ShareGPT an 8% improvement, and with OpenPlatypus an 8% improvement. Even powerful models further refined with RLHF such as LLaMA-2-Chat benefit from additional training with NEFTune.
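The technique the abstract describes, adding noise to the embedding vectors during training only, can be sketched as a PyTorch forward hook on the input embedding layer. This is a minimal sketch, not the authors' implementation: the uniform noise scaled by alpha / sqrt(L * d) follows the scheme NEFTune is known for, and `get_input_embeddings()` is the Hugging Face `transformers` accessor; both are assumptions here, since the abstract itself does not spell them out.

```python
import torch

def add_neftune_hook(model, neft_alpha: float = 5.0):
    """Sketch of NEFTune-style noisy embeddings (assumed details, see lead-in).

    Registers a forward hook on the input embedding layer that, in training
    mode only, adds uniform noise in [-scale, scale] to the embedding output,
    where scale = neft_alpha / sqrt(seq_len * hidden_dim).
    """
    embed = model.get_input_embeddings()  # assumes a transformers-style model

    def hook(module, inputs, output):
        if module.training:
            # output is assumed to have shape (batch, seq_len, hidden_dim)
            seq_len, hidden_dim = output.shape[1], output.shape[2]
            scale = neft_alpha / (seq_len * hidden_dim) ** 0.5
            output = output + torch.empty_like(output).uniform_(-scale, scale)
        return output  # returning a value replaces the embedding output

    return embed.register_forward_hook(hook)
```

Because the hook checks `module.training`, evaluation and generation are unaffected; the handle returned by `register_forward_hook` can be used to remove the hook after finetuning.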