LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample
Code
- github.com/facebookresearch/llama (official, in paper) · PyTorch · ★ 59,251
- github.com/huggingface/transformers · PyTorch · ★ 158,292
- github.com/ggml-org/llama.cpp · C/C++ · ★ 99,058
- github.com/tatsu-lab/stanford_alpaca · PyTorch · ★ 30,255
- github.com/flagalpha/llama2-chinese · PyTorch · ★ 14,741
- github.com/Lightning-AI/lit-llama · PyTorch · ★ 6,083
- github.com/facico/chinese-vicuna · PyTorch · ★ 4,132
Abstract
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| GSM8K | LLaMA 7B | Accuracy | 11.0 | — | Unverified |
| GSM8K | LLaMA 7B (maj1@k) | Accuracy | 18.1 | — | Unverified |
| GSM8K | LLaMA 13B | Accuracy | 17.8 | — | Unverified |
| GSM8K | LLaMA 13B (maj1@k) | Accuracy | 29.3 | — | Unverified |
| GSM8K | LLaMA 33B | Accuracy | 35.6 | — | Unverified |
| GSM8K | LLaMA 33B (maj1@k) | Accuracy | 53.1 | — | Unverified |
| GSM8K | LLaMA 65B | Accuracy | 50.9 | — | Unverified |
| GSM8K | LLaMA 65B (maj1@k) | Accuracy | 69.7 | — | Unverified |
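The maj1@k rows report accuracy when the model's final answer is chosen by majority vote over k sampled generations, rather than from a single greedy decode. The helper below is a minimal illustrative sketch of that voting step (the function name and the example answers are hypothetical, not from the paper's evaluation code); extracting the final numeric answer from each sampled chain of thought is assumed to happen upstream.

```python
from collections import Counter

def majority_vote(final_answers):
    """Pick the most frequent final answer among k sampled generations.

    `final_answers` is a hypothetical list of answers already extracted
    from k model samples; ties resolve to the answer seen first.
    """
    return Counter(final_answers).most_common(1)[0][0]

# Five hypothetical sampled chains of thought for one GSM8K problem,
# each reduced to its final numeric answer:
samples = [18, 18, 21, 18, 17]
print(majority_vote(samples))  # → 18
```

Because sampling surfaces many reasoning paths, this voting scheme lifts GSM8K accuracy substantially at every model size in the table above (e.g. 50.9 → 69.7 for LLaMA 65B).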