Math

Papers

Showing 26–50 of 1596 papers

Title	Date	Tasks	Status	Hype
Qwen Technical Report	Sep 28, 2023	Language Modeling, Language Modelling	Code Available	6
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration	Jun 1, 2023	Autonomous Driving, Cloud Computing	Code Available	6
GPT-4 Technical Report	Mar 15, 2023	answerability prediction, Arithmetic Reasoning	Code Available	6
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	Jan 28, 2022	Common Sense Reasoning, GSM8K	Code Available	6
Reinforcement Learning from Human Feedback	Apr 16, 2025	Math, Philosophy	Code Available	5
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models	Mar 9, 2025	Math, Multimodal Reasoning	Code Available	5
LIMO: Less is More for Reasoning	Feb 5, 2025	Math, Mathematical Reasoning	Code Available	5
Process Reinforcement through Implicit Rewards	Feb 3, 2025	Math, Reinforcement Learning (RL)	Code Available	5
Free Process Rewards without Process Labels	Dec 2, 2024	Math	Code Available	5
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models	Oct 12, 2024	Math, reinforcement-learning	Code Available	5
LiveBench: A Challenging, Contamination-Limited LLM Benchmark	Jun 27, 2024	Articles, Instruction Following	Code Available	5
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B	Jun 11, 2024	Decision Making, GSM8K	Code Available	5
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit	Apr 22, 2024	Math	Code Available	5
Evolutionary Optimization of Model Merging Recipes	Mar 19, 2024	Evolutionary Algorithms, Math	Code Available	5
Common 7B Language Models Already Possess Strong Math Capabilities	Mar 7, 2024	GSM8K, Math	Code Available	5
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct	Aug 18, 2023	Arithmetic Reasoning, GSM8K	Code Available	5
Energy-Based Transformers are Scalable Learners and Thinkers	Jul 2, 2025	Denoising, Image Denoising	Code Available	4
Skywork Open Reasoner 1 Technical Report	May 28, 2025	Math, Reinforcement Learning (RL)	Code Available	4
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision	May 19, 2025	Math, Mathematical Reasoning	Code Available	4
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset	Apr 23, 2025	Math, Mathematical Reasoning	Code Available	4
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond	Mar 13, 2025	Domain Generalization, Math	Code Available	4
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction	Feb 11, 2025	Code Generation, Math	Code Available	4
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates	Feb 10, 2025	Hierarchical Reinforcement Learning, Language Modeling	Code Available	4
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Oct 11, 2024	GSM8K, Math	Code Available	4

