SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 201–210 of 439 papers

Title	Date	Tasks	Status	Hype
Let's reward step by step: Step-Level reward model as the Navigators for Reasoning	Oct 16, 2023	Code GenerationGSM8K	—Unverified	0
Meaning-Typed Programming: Language Abstraction and Runtime for Model-Integrated Applications	May 14, 2024	GSM8KMath	—Unverified	0
Memory-Efficient LLM Training by Various-Grained Low-Rank Projection of Gradients	May 3, 2025	GSM8KMMLU	—Unverified	0
DavIR: Data Selection via Implicit Reward for Large Language Models	Oct 16, 2023	Causal Language ModelingGSM8K	—Unverified	0
Model Unlearning via Sparse Autoencoder Subspace Guided Projections	May 30, 2025	Adversarial Robustnessfeature selection	—Unverified	0
Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems	Aug 29, 2024	GSM8KLanguage Modeling	—Unverified	0
Elastic Weight Consolidation for Full-Parameter Continual Pre-Training of Gemma2	May 9, 2025	ARCBelebele	—Unverified	0
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models	Feb 24, 2024	GSM8KMathematical Reasoning	—Unverified	0
Let's Reinforce Step by Step	Nov 10, 2023	GSM8KLogical Reasoning	—Unverified	0
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint	Feb 24, 2025	GSM8K	—Unverified	0

Show:10 25 50

← PrevPage 21 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified