Math

Papers

Showing 291–300 of 1596 papers

Title	Date	Tasks	Status	Hype
DebFlow: Automating Agent Creation via Agent Debate	Mar 31, 2025	Math	Unverified	0
ToRL: Scaling Tool-Integrated RL	Mar 30, 2025	Math, reinforcement-learning	Code Available	3
Learning to Reason for Long-Form Story Generation	Mar 28, 2025	Form, Math	Code Available	2
QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?	Mar 28, 2025	Logical Reasoning, Math	Code Available	1
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models	Mar 28, 2025	GPU, GSM8K	Code Available	2
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models	Mar 27, 2025	Math	Code Available	1
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad	Mar 27, 2025	Math, Mathematical Reasoning	Unverified	0
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models	Mar 27, 2025	Data Visualization, Math	Code Available	0
Effective Skill Unlearning through Intervention and Abstention	Mar 27, 2025	General Knowledge, Math	Code Available	0
Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators	Mar 25, 2025	Math	Unverified	0
