SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 361–370 of 439 papers

Title	Date	Tasks	Status	Hype	Score
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs	Dec 11, 2024	ARCGSM8K	—Unverified	0	0
SOLAR: Scalable Optimization of Large-scale Architecture for Reasoning	Mar 6, 2025	GSM8KMath	—Unverified	0	0
Complexity-Based Prompting for Multi-Step Reasoning	Oct 3, 2022	Date UnderstandingGSM8K	—Unverified	0	0
Solving math word problems with process- and outcome-based feedback	Nov 25, 2022	Arithmetic ReasoningGSM8K	—Unverified	0	0
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning	Oct 3, 2024	GSM8KLanguage Modeling	—Unverified	0	0
Weaker LLMs' Opinions Also Matter: Mixture of Opinions Enhances LLM's Mathematical Reasoning	Feb 26, 2025	GSM8KMathematical Reasoning	—Unverified	0	0
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths	May 30, 2024	GSM8KHumanEval	—Unverified	0	0
Steering LLM Reasoning Through Bias-Only Adaptation	May 24, 2025	GSM8KMath	—Unverified	0	0
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference	Feb 1, 2025	GPUGSM8K	—Unverified	0	0
Can Separators Improve Chain-of-Thought Prompting?	Feb 16, 2024	8kGSM8K	—Unverified	0	0

Show:10 25 50

← PrevPage 37 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified