Sentence Completion

Papers

Showing 21–30 of 91 papers

Title	Date	Tasks	Status	Hype
Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering	Oct 29, 2022	Binary Classification, Question Answering	Code Available	1
Task Compass: Scaling Multi-task Pre-training with Task Prefix	Oct 12, 2022	Common Sense Reasoning, Data Augmentation	Code Available	1
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners	Oct 6, 2022	Common Sense Reasoning, Coreference Resolution	Code Available	1
Measuring Harmful Sentence Completion in Language Models for LGBTQIA+ Individuals	May 1, 2022	Sentence, Sentence Completion	Code Available	1
HONEST: Measuring Hurtful Sentence Completion in Language Models	Jun 1, 2021	Hate Speech Detection, Hurtful Sentence Completion	Code Available	1
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark	Mar 24, 2021	Common Sense Reasoning, HellaSwag	Code Available	1
GePpeTto Carves Italian into a Language Model	Apr 29, 2020	Language Modeling, Language Modelling	Code Available	1
RoBERTa: A Robustly Optimized BERT Pretraining Approach	Jul 26, 2019	Common Sense Reasoning, Document Image Classification	Code Available	1
Evaluating Gender Bias in Large Language Models	Nov 14, 2024	Model Selection, Sentence	Unverified	0
KatzBot: Revolutionizing Academic Chatbot for Enhanced Communication	Oct 21, 2024	Chatbot, Language Modeling	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CompassMTL 567M with Tailor	Accuracy	96.1	—	Unverified
2	CompassMTL 567M	Accuracy	95.6	—	Unverified
3	DeBERTa-Large 304M (classification-based)	Accuracy	95.6	—	Unverified
4	GPT-4 (10-shot)	Accuracy	95.3	—	Unverified
5	LLaMA3+MoSLoRA	Accuracy	95	—	Unverified
6	LLaMA-2 13B + MixLoRA	Accuracy	94.7	—	Unverified
7	DeBERTa-Large 304M	Accuracy	94.7	—	Unverified
8	Unicorn 11B (fine-tuned)	Accuracy	93.9	—	Unverified
9	LLaMA-3 8B + MixLoRA	Accuracy	93.3	—	Unverified
10	LLaMA-2 7B + MixLoRA	Accuracy	93.1	—	Unverified