SOTAVerified|Agents Browse Leaderboard About Blog

Sentence Completion

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–10 of 91 papers

Title	Date	Tasks	Status	Hype	Score
Llama 2: Open Foundation and Fine-Tuned Chat Models	Jul 18, 2023	Arithmetic Reasoning	CodeCode Available	8	5
LLaMA: Open and Efficient Foundation Language Models	Feb 27, 2023	Arithmetic ReasoningCode Generation	CodeCode Available	7	5
Training Compute-Optimal Large Language Models	Mar 29, 2022	AnachronismsAnalogical Similarity	CodeCode Available	6	5
Mistral 7B	Oct 10, 2023	answerability predictionArithmetic Reasoning	CodeCode Available	6	5
GPT-4 Technical Report	Mar 15, 2023	answerability predictionArithmetic Reasoning	CodeCode Available	6	5
Mamba: Linear-Time Sequence Modeling with Selective State Spaces	Dec 1, 2023	2D Pose EstimationCommon Sense Reasoning	CodeCode Available	6	5
Factuality Enhanced Language Models for Open-Ended Text Generation	Jun 9, 2022	MisconceptionsSentence	CodeCode Available	5	5
Language Models are Few-Shot Learners	May 28, 2020	answerability predictionArticles	CodeCode Available	3	5
Finetuned Language Models Are Zero-Shot Learners	Sep 3, 2021	ARCCommon Sense Reasoning	CodeCode Available	3	5
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts	Apr 22, 2024	Common Sense ReasoningGPU	CodeCode Available	3	5

Show:10 25 50

← PrevPage 1 of 10Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CompassMTL 567M with Tailor	Accuracy	96.1	—	Unverified
2	CompassMTL 567M	Accuracy	95.6	—	Unverified
3	DeBERTa-Large 304M (classification-based)	Accuracy	95.6	—	Unverified
4	GPT-4 (10-shot)	Accuracy	95.3	—	Unverified
5	LLaMA3+MoSLoRA	Accuracy	95	—	Unverified
6	LLaMA-2 13B + MixLoRA	Accuracy	94.7	—	Unverified
7	DeBERTa-Large 304M	Accuracy	94.7	—	Unverified
8	Unicorn 11B (fine-tuned)	Accuracy	93.9	—	Unverified
9	LLaMA-3 8B + MixLoRA	Accuracy	93.3	—	Unverified
10	LLaMA-2 7B + MixLoRA	Accuracy	93.1	—	Unverified