Math

Papers

Showing 51–75 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine	Jul 11, 2024	Contrastive Learning, Language Modelling	Code Available	4	5
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models	Jun 9, 2022	Common Sense Reasoning, Math	Code Available	4	5
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision	May 19, 2025	Math, Mathematical Reasoning	Code Available	4	5
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Oct 11, 2024	GSM8K, Math	Code Available	4	5
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond	Mar 13, 2025	Domain Generalization, Math	Code Available	4	5
Let's Verify Step by Step	May 31, 2023	Active Learning, Math	Code Available	4	5
LLaMA Pro: Progressive LLaMA with Block Expansion	Jan 4, 2024	Instruction Following, Math	Code Available	4	5
Skywork Open Reasoner 1 Technical Report	May 28, 2025	Math, Reinforcement Learning (RL)	Code Available	4	5
ReFT: Reasoning with Reinforced Fine-Tuning	Jan 17, 2024	GSM8K, Math	Code Available	4	5
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning	Feb 9, 2024	Data Augmentation, GSM8K	Code Available	4	5
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4	5
Dive into Deep Learning	Jun 21, 2021	Deep Learning, Math	Code Available	4	5
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction	Feb 11, 2025	Code Generation, Math	Code Available	4	5
How is ChatGPT's behavior changing over time?	Jul 18, 2023	Code Generation, Language Modelling	Code Available	4	5
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems	Jun 6, 2024	Automated Theorem Proving, Math	Code Available	4	5
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving	Feb 11, 2025	Automated Theorem Proving, Large Language Model	Code Available	3	5
General-Reasoner: Advancing LLM Reasoning Across All Domains	May 20, 2025	All, Math	Code Available	3	5
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks	Nov 22, 2022	Math	Code Available	3	5
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities	Aug 1, 2024	Math, MM-Vet	Code Available	3	5
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling	Feb 10, 2025	Math	Code Available	3	5
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	May 13, 2024	Data Augmentation, GSM8K	Code Available	3	5
Noise Contrastive Alignment of Language Models with Explicit Rewards	Feb 8, 2024	Language Modelling, Math	Code Available	3	5
PAL: Program-aided Language Models	Nov 18, 2022	Arithmetic Reasoning, GSM8K	Code Available	3	5
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning	Jun 13, 2024	Instruction Following, Math	Code Available	3	5
MathArena: Evaluating LLMs on Uncontaminated Math Competitions	May 29, 2025	Math, Mathematical Reasoning	Code Available	3	5

