SOTAVerified

GSM8K

Papers

Showing 71-80 of 439 papers

| Title | Status | Hype |
|---|---|---|
| Dynamic Early Exit in Reasoning Models | Code | 2 |
| Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch | Code | 2 |
| Balancing LoRA Performance and Efficiency with Simple Shard Sharing | Code | 2 |
| Offline Reinforcement Learning for LLM Multi-Step Reasoning | Code | 2 |
| Language Models are Multilingual Chain-of-Thought Reasoners | Code | 2 |
| Large Language Models are Zero-Shot Reasoners | Code | 2 |
| Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process | Code | 2 |
| MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2 |
| MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning | Code | 2 |
| Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards | Code | 2 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Xolver | Accuracy | 98.1 | | Unverified |
| 2 | Orange-mini | 0-shot MRR | 98 | | Unverified |