SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 291–300 of 439 papers

Title	Date	Tasks	Status	Hype
Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning	Apr 18, 2025	AllGSM8K	—Unverified	0
No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function	Sep 1, 2023	GSM8KMathematical Reasoning	—Unverified	0
Nudging: Inference-time Alignment of LLMs via Guided Decoding	Oct 11, 2024	General KnowledgeGSM8K	—Unverified	0
On Designing Effective RL Reward at Training Time for LLM Reasoning	Oct 19, 2024	GSM8KMath	—Unverified	0
Making Large Language Models Better Reasoners with Step-Aware Verifier	Jun 6, 2022	Arithmetic ReasoningFew-Shot Learning	—Unverified	0
Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation	Oct 22, 2024	GSM8KMath	—Unverified	0
Orca-Math: Unlocking the potential of SLMs in Grade School Math	Feb 16, 2024	Arithmetic ReasoningGSM8K	—Unverified	0
PARAMANU-GANITA: Language Model with Mathematical Capabilities	Apr 22, 2024	Domain AdaptationGSM8K	—Unverified	0
Patience Is The Key to Large Language Model Reasoning	Nov 20, 2024	GSM8KLanguage Modeling	—Unverified	0
PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation	Oct 2, 2024	Data AugmentationDiversity	—Unverified	0

Show:10 25 50

← PrevPage 30 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified