
Math

Papers


Showing 351–360 of 1596 papers

Title	Date	Tasks	Status	Hype
Cats Confuse Reasoning LLM: Query Agnostic Adversarial Triggers for Reasoning Models	Mar 3, 2025	Math	Unverified	0
MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts	Feb 28, 2025	Math, Mathematical Reasoning	Unverified	0
MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training	Feb 28, 2025	Language Modeling, Language Modelling	Code Available	0
FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving	Feb 27, 2025	GSM8K, Math	Code Available	1
Self-Training Elicits Concise Reasoning in Large Language Models	Feb 27, 2025	GSM8K, In-Context Learning	Code Available	1
Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning	Feb 27, 2025	Math, Medical Question Answering	Unverified	0
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?	Feb 26, 2025	Math	Code Available	1
Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation	Feb 26, 2025	Code Generation, HumanEval	Code Available	2
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning	Feb 25, 2025	Math, Mathematical Reasoning	Unverified	0
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution	Feb 25, 2025	Math, Reinforcement Learning (RL)	Unverified	0

