Winogrande

Papers

Showing 1–25 of 26 papers

Title	Date	Tasks	Status	Hype
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning	Mar 26, 2024	GPU, GSM8K	Code Available	9
ST-MoE: Designing Stable and Transferable Sparse Expert Models	Feb 17, 2022	ARC, Common Sense Reasoning	Code Available	3
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding	Apr 25, 2024	GSM8K, HellaSwag	Code Available	3
Scaling Language Models: Methods, Analysis & Insights from Training Gopher	Dec 8, 2021	Abstract Algebra, Anachronisms	Code Available	2
Bridging the Gap: Enhancing LLM Performance for Low-Resource African Languages with New Benchmarks, Fine-Tuning, and Cultural Adjustments	Dec 16, 2024	Clinical Knowledge, College Medicine	Code Available	1
WinoGrande: An Adversarial Winograd Schema Challenge at Scale	Jul 24, 2019	Common Sense Reasoning, Coreference Resolution	Code Available	1
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark	Mar 24, 2021	Common Sense Reasoning, HellaSwag	Code Available	1
Generative Data Augmentation for Commonsense Reasoning	Apr 24, 2020	Common Sense Reasoning, Coreference Resolution	Code Available	1
PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches	Oct 8, 2024	GPU, GSM8K	Unverified	0
Promises, Outlooks and Challenges of Diffusion Language Modeling	Jun 17, 2024	ARC, HellaSwag	Unverified	0
WinoWhat: A Parallel Corpus of Paraphrased WinoGrande Sentences with Common Sense Categorization	Mar 31, 2025	Common Sense Reasoning, Memorization	Unverified	0
A Warm Start and a Clean Crawled Corpus -- A Recipe for Good Language Models	Jan 14, 2022	Constituency Parsing, Grammatical Error Detection	Unverified	0
A Warm Start and a Clean Crawled Corpus - A Recipe for Good Language Models	Jun 1, 2022	Constituency Parsing, Grammatical Error Detection	Unverified	0
Elastic Weight Consolidation for Full-Parameter Continual Pre-Training of Gemma2	May 9, 2025	ARC, Belebele	Unverified	0
Judgment of Thoughts: Courtroom of the Binary Logical Reasoning in Large Language Models	Sep 25, 2024	Fake News Detection, Language Modeling	Unverified	0
More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment	Apr 3, 2025	ARC, HellaSwag	Unverified	0
Not-so fine-tuning: Measures of Common Sense for Language Models	Sep 29, 2021	Common Sense Reasoning, GPU	Unverified	0
Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models	Feb 20, 2025	HellaSwag, Memorization	Unverified	0
TTTTTackling WinoGrande Schemas	Mar 18, 2020	Coreference Resolution, Winogrande	Unverified	0
Unsupervised Pronoun Resolution via Masked Noun-Phrase Prediction	May 26, 2021	Prediction, Winogrande	Unverified	0
Who's Harry Potter? Approximate Unlearning in LLMs	Oct 3, 2023	ARC, GPU	Unverified	0
An Application of Pseudo-Log-Likelihoods to Natural Language Scoring	Jan 23, 2022	Common Sense Reasoning, GPU	Unverified	0
On Curriculum Learning for Commonsense Reasoning	Jul 1, 2022	HellaSwag, Learning-To-Rank	Code Available	0
metabench -- A Sparse Benchmark to Measure General Ability in Large Language Models	Jul 4, 2024	ARC, GSM8K	Code Available	0
Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations	Nov 14, 2022	Winogrande	Code Available	0

No leaderboard results yet.