Math

Papers

Showing 101–125 of 1596 papers

Title	Date	Tasks	Status	Hype
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	May 13, 2024	Data Augmentation, GSM8K	Code Available	3
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning	Jun 13, 2024	Instruction Following, Math	Code Available	3
Llemma: An Open Language Model For Mathematics	Oct 16, 2023	Arithmetic Reasoning, Automated Theorem Proving	Code Available	3
Thinkless: LLM Learns When to Think	May 19, 2025	GSM8K, Math	Code Available	3
Step-level Value Preference Optimization for Mathematical Reasoning	Jun 16, 2024	Learning-To-Rank, Math	Code Available	3
Dynamic Early Exit in Reasoning Models	Apr 22, 2025	GSM8K, Math	Code Available	2
Memorizing Transformers	Mar 16, 2022	Language Modeling, Language Modelling	Code Available	2
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset	Feb 22, 2024	Diversity, Math	Code Available	2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts	Oct 3, 2023	Chatbot, Image Captioning	Code Available	2
MegaMath: Pushing the Limits of Open Math Corpora	Apr 3, 2025	Diversity, Math	Code Available	2
Meta-Design Matters: A Self-Design Multi-Agent System	May 21, 2025	Math, Problem Decomposition	Code Available	2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning	Oct 5, 2023	Arithmetic Reasoning, GSM8K	Code Available	2
MAS-Zero: Designing Multi-Agent Systems with Zero Supervision	May 26, 2025	Math, Problem Decomposition	Code Available	2
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark	May 20, 2024	College Mathematics, GSM8K	Code Available	2
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models	Jun 13, 2024	Math, Quantization	Code Available	2
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models	Jun 25, 2024	Diversity, Math	Code Available	2
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code	Oct 10, 2024	Math, Mathematical Reasoning	Code Available	2
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models	Sep 21, 2023	Arithmetic Reasoning, GSM8K	Code Available	2
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models	Apr 13, 2023	Decision Making, Math	Code Available	2
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data	Jun 26, 2024	Benchmarking, Math	Code Available	2
Cumulative Reasoning with Large Language Models	Aug 8, 2023	Decision Making, Logical Reasoning	Code Available	2
Measuring Mathematical Problem Solving With the MATH Dataset	Mar 5, 2021	Math, Mathematical Problem-Solving	Code Available	2
Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving	May 12, 2025	Math, Mathematical Problem-Solving	Code Available	2
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving	Jun 18, 2024	Arithmetic Reasoning, Math	Code Available	2
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate	Jan 29, 2025	Instruction Following, Math	Code Available	2

Page 5 of 64

No leaderboard results yet.