SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 71–80 of 439 papers

Title	Date	Tasks	Status	Hype	Score
Dynamic Early Exit in Reasoning Models	Apr 22, 2025	GSM8KMath	CodeCode Available	2	5
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch	Nov 6, 2023	DecoderGSM8K	CodeCode Available	2	5
Balancing LoRA Performance and Efficiency with Simple Shard Sharing	Sep 19, 2024	Computational EfficiencyGSM8K	CodeCode Available	2	5
Offline Reinforcement Learning for LLM Multi-Step Reasoning	Dec 20, 2024	GSM8KMath	CodeCode Available	2	5
Language Models are Multilingual Chain-of-Thought Reasoners	Oct 6, 2022	GSM8KMath	CodeCode Available	2	5
Large Language Models are Zero-Shot Reasoners	May 24, 2022	Arithmetic ReasoningCommon Sense Reasoning	CodeCode Available	2	5
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process	Jul 29, 2024	GSM8KMath	CodeCode Available	2	5
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning	Oct 5, 2023	Arithmetic ReasoningGSM8K	CodeCode Available	2	5
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning	Oct 9, 2023	Arithmetic ReasoningData Augmentation	CodeCode Available	2	5
Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards	Apr 16, 2024	GSM8KMath	CodeCode Available	2	5

Show:10 25 50

← PrevPage 8 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified