WizardCoder: Empowering Code Large Language Models with Evol-Instruct
Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang
Code
- github.com/nlpxucan/wizardlm (official, in paper; PyTorch, ★ 9,478)
- github.com/nickrosh/evol-teacher (PyTorch, ★ 166)
- github.com/kyle-lyu/codeact (PyTorch, ★ 33)
- github.com/kyle-lyu/data-efficient-finetuning (PyTorch, ★ 33)
Abstract
Code Large Language Models (Code LLMs), such as StarCoder, have demonstrated exceptional performance in code-related tasks. However, most existing models are solely pre-trained on extensive raw code data without instruction fine-tuning. In this paper, we introduce WizardCoder, which empowers Code LLMs with complex instruction fine-tuning by adapting the Evol-Instruct method to the domain of code. Through comprehensive experiments on four prominent code generation benchmarks, namely HumanEval, HumanEval+, MBPP, and DS-1000, we unveil the exceptional capabilities of our model. It surpasses all other open-source Code LLMs by a substantial margin. Moreover, our model even outperforms the largest closed LLMs, Anthropic's Claude and Google's Bard, on HumanEval and HumanEval+. Our code, model weights, and data are public at https://github.com/nlpxucan/WizardLM.
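The abstract's core idea, adapting Evol-Instruct to code, is an iterative loop: an LLM rewrites each coding instruction into a harder variant, and the evolved instructions are accumulated into the fine-tuning set. The sketch below illustrates that loop; the template wordings and helper names are assumptions paraphrasing the kinds of code-specific heuristics the paper describes (adding constraints, requiring more reasoning steps, supplying erroneous reference code), not the paper's exact prompts.

```python
import random

# Illustrative evolution templates (assumed wordings, not the paper's
# verbatim prompts). Each asks an LLM to make a coding task harder.
EVOL_TEMPLATES = [
    "Add one more constraint or requirement to the following programming task:\n{instruction}",
    "Replace a commonly used requirement in the task below with a less common one:\n{instruction}",
    "Rewrite the task below so that it explicitly requires multi-step reasoning:\n{instruction}",
    "Provide a piece of erroneous code as a reference to raise the difficulty of:\n{instruction}",
    "Increase the time or space complexity requirements of the task below:\n{instruction}",
]

def evolve_instruction(instruction: str, llm, rng=random) -> str:
    """Apply one evolution step: pick a template, ask the LLM to rewrite.

    `llm` is any callable mapping a prompt string to a completion string
    (e.g. a thin wrapper around a chat-model API); it is a placeholder here.
    """
    prompt = rng.choice(EVOL_TEMPLATES).format(instruction=instruction)
    return llm(prompt)

def build_evolved_dataset(seed_instructions, llm, rounds=3):
    """Evolve a seed set for several rounds, keeping every generation.

    Round r evolves the instructions produced in round r-1, so the pool
    grows and (ideally) gets progressively harder.
    """
    dataset = list(seed_instructions)
    current = list(seed_instructions)
    for _ in range(rounds):
        current = [evolve_instruction(inst, llm) for inst in current]
        dataset.extend(current)
    return dataset
```

In the real pipeline each evolved instruction would also be paired with a model-generated solution and filtered for failed evolutions before fine-tuning; those steps are omitted here for brevity.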
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| CodeContests | WizardCoder-15B | Test Set pass@1 | 1.11 | — | Unverified |
| MBPP | WizardCoder-15B | pass@1 | 51.8 | — | Unverified |
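The pass@1 numbers above use the standard unbiased pass@k estimator from the HumanEval benchmark: generate n samples per problem, count the c that pass the unit tests, and compute pass@k = 1 - C(n-c, k) / C(n, k). A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (HumanEval-style evaluation).

    n: total samples generated for a problem
    c: number of samples that passed all unit tests
    k: budget of attempts being scored
    """
    if n - c < k:
        # Fewer than k failures exist, so any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 5 samples, 2 correct, scoring pass@1 -> 1 - C(3,1)/C(5,1) = 0.4
```

Per-problem scores are then averaged over the benchmark to give the single percentage reported in tables like the one above.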