SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 251–260 of 439 papers

Title	Date	Tasks	Status	Hype
Iterative Reasoning Preference Optimization	Apr 30, 2024	ARCGSM8K	—Unverified	0
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning	Mar 4, 2024	GSM8KMath	—Unverified	0
KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning?	Jul 15, 2025	GSM8KLanguage Modeling	—Unverified	0
Kwai-STaR: Transform LLMs into State-Transition Reasoners	Nov 7, 2024	GSM8KMathematical Problem-Solving	—Unverified	0
KwaiYiiMath: Technical Report	Oct 11, 2023	Arithmetic ReasoningGSM8K	—Unverified	0
Large Language Models as Analogical Reasoners	Oct 3, 2023	Code GenerationGSM8K	—Unverified	0
Large Language Models Can Self-Improve	Oct 20, 2022	Arithmetic ReasoningCommon Sense Reasoning	—Unverified	0
Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge	Feb 27, 2025	GSM8KHumanEval	—Unverified	0
LearnAlign: Reasoning Data Selection for Reinforcement Learning in Large Language Models Based on Improved Gradient Alignment	Jun 13, 2025	GSM8KMathematical Reasoning	—Unverified	0
Learning to Rank Chain-of-Thought: An Energy-Based Approach with Outcome Supervision	May 21, 2025	GSM8KLearning-To-Rank	—Unverified	0

Show:10 25 50

← PrevPage 26 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified