Sentence Completion

Papers

Showing 21–30 of 91 papers

Title	Date	Tasks	Status	Hype
Llama 2: Open Foundation and Fine-Tuned Chat Models	Jul 18, 2023	Arithmetic Reasoning	Code Available	8
Stay on topic with Classifier-Free Guidance	Jun 30, 2023	Code Generation, Common Sense Reasoning	Unverified	0
ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning	May 30, 2023	Benchmarking, In-Context Learning	Code Available	0
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning	May 23, 2023	Common Sense Reasoning, Common Sense Reasoning (Zero-Shot)	Code Available	2
PaLM 2 Technical Report	May 17, 2023	Code Generation, Common Sense Reasoning	Unverified	0
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions	Apr 27, 2023	Common Sense Reasoning, Coreference Resolution	Code Available	2
BloombergGPT: A Large Language Model for Finance	Mar 30, 2023	Causal Judgment, Common Sense Reasoning	Unverified	0
GPT-4 Technical Report	Mar 15, 2023	Answerability Prediction, Arithmetic Reasoning	Code Available	6
LLaMA: Open and Efficient Foundation Language Models	Feb 27, 2023	Arithmetic Reasoning, Code Generation	Code Available	7
Exploring the Benefits of Training Expert Language Models over Instruction Tuning	Feb 7, 2023	Common Sense Reasoning, Coreference Resolution	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CompassMTL 567M with Tailor	Accuracy	96.1	—	Unverified
2	CompassMTL 567M	Accuracy	95.6	—	Unverified
3	DeBERTa-Large 304M (classification-based)	Accuracy	95.6	—	Unverified
4	GPT-4 (10-shot)	Accuracy	95.3	—	Unverified
5	LLaMA3+MoSLoRA	Accuracy	95	—	Unverified
6	LLaMA-2 13B + MixLoRA	Accuracy	94.7	—	Unverified
7	DeBERTa-Large 304M	Accuracy	94.7	—	Unverified
8	Unicorn 11B (fine-tuned)	Accuracy	93.9	—	Unverified
9	LLaMA-3 8B + MixLoRA	Accuracy	93.3	—	Unverified
10	LLaMA-2 7B + MixLoRA	Accuracy	93.1	—	Unverified