LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample
Code
- github.com/facebookresearch/llama (official, in paper) · PyTorch · ★ 59,251
- github.com/huggingface/transformers · PyTorch · ★ 158,292
- github.com/ggml-org/llama.cpp · C/C++ · ★ 99,058
- github.com/tatsu-lab/stanford_alpaca · PyTorch · ★ 30,255
- github.com/flagalpha/llama2-chinese · PyTorch · ★ 14,741
- github.com/Lightning-AI/lit-llama · PyTorch · ★ 6,083
- github.com/facico/chinese-vicuna · PyTorch · ★ 4,132
Abstract
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| GSM8K | LLaMA 7B | Accuracy | 11.0 | — | Unverified |
| GSM8K | LLaMA 7B (maj1@k) | Accuracy | 18.1 | — | Unverified |
| GSM8K | LLaMA 13B | Accuracy | 17.8 | — | Unverified |
| GSM8K | LLaMA 13B (maj1@k) | Accuracy | 29.3 | — | Unverified |
| GSM8K | LLaMA 33B | Accuracy | 35.6 | — | Unverified |
| GSM8K | LLaMA 33B (maj1@k) | Accuracy | 53.1 | — | Unverified |
| GSM8K | LLaMA 65B | Accuracy | 50.9 | — | Unverified |
| GSM8K | LLaMA 65B (maj1@k) | Accuracy | 69.7 | — | Unverified |
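The maj1@k rows report accuracy when the model's final answer is chosen by majority vote over k sampled generations, rather than from a single greedy decode. The helper below is a minimal illustrative sketch of that voting step (the function name and the example answers are hypothetical, not from the paper's evaluation code); extracting the final numeric answer from each sampled chain of thought is assumed to happen upstream.

```python
from collections import Counter

def majority_vote(final_answers):
    """Pick the most frequent final answer among k sampled generations.

    `final_answers` is a hypothetical list of answers already extracted
    from k model samples; ties resolve to the answer seen first.
    """
    return Counter(final_answers).most_common(1)[0][0]

# Five hypothetical sampled chains of thought for one GSM8K problem,
# each reduced to its final numeric answer:
samples = [18, 18, 21, 18, 17]
print(majority_vote(samples))  # → 18
```

Because sampling surfaces many reasoning paths, this voting scheme lifts GSM8K accuracy substantially at every model size in the table above (e.g. 50.9 → 69.7 for LLaMA 65B).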