LLaMA: Open and Efficient Foundation Language Models

2023-02-27 · arXiv 2023 · Code Available

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample


Abstract

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---------|-------|--------|---------|----------|--------|
| GSM8K | LLaMA 7B | Accuracy | 11.0 | | Unverified |
| GSM8K | LLaMA 13B | Accuracy | 17.8 | | Unverified |
| GSM8K | LLaMA 33B | Accuracy | 35.6 | | Unverified |
| GSM8K | LLaMA 65B | Accuracy | 50.9 | | Unverified |
| GSM8K | LLaMA 7B (maj1@k) | Accuracy | 18.1 | | Unverified |
| GSM8K | LLaMA 13B (maj1@k) | Accuracy | 29.3 | | Unverified |
| GSM8K | LLaMA 33B (maj1@k) | Accuracy | 53.1 | | Unverified |
| GSM8K | LLaMA 65B (maj1@k) | Accuracy | 69.7 | | Unverified |
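The maj1@k rows above report majority voting: for each GSM8K problem, k solutions are sampled from the model and the most frequent final answer is taken as the prediction. A minimal sketch of how such a score could be computed (the function names and toy data are illustrative, not from the paper's evaluation code):

```python
from collections import Counter

def maj1_at_k(answers):
    # Majority vote over the k sampled final answers for one problem;
    # Counter.most_common breaks ties by first occurrence.
    return Counter(answers).most_common(1)[0][0]

def maj1_at_k_accuracy(sampled_answers, gold_answers):
    # sampled_answers: one list of k extracted final answers per problem.
    # gold_answers: the reference answer per problem.
    hits = sum(maj1_at_k(s) == g for s, g in zip(sampled_answers, gold_answers))
    return hits / len(gold_answers)

# Toy example with 2 problems and k=3 samples each.
samples = [["18", "18", "20"], ["7", "9", "9"]]
gold = ["18", "7"]
print(maj1_at_k_accuracy(samples, gold))  # 0.5
```

This is why the maj1@k numbers exceed the single-sample accuracies for every model size: aggregating k samples filters out occasional arithmetic slips in individual chains of thought.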

Reproductions