Math

Papers

Showing 226–250 of 1596 papers

Title	Date	Tasks	Status	Hype
Specializing Smaller Language Models towards Multi-Step Reasoning	Jan 30, 2023	Math, Model Selection	Code Available	2
A Survey of Deep Learning for Mathematical Reasoning	Dec 20, 2022	Deep Learning, Math	Code Available	2
Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem	Oct 21, 2022	Contrastive Learning, Math	Code Available	2
Language Models are Multilingual Chain-of-Thought Reasoners	Oct 6, 2022	GSM8K, Math	Code Available	2
PaLM: Scaling Language Modeling with Pathways	Apr 5, 2022	Auto Debugging, Code Generation	Code Available	2
Memorizing Transformers	Mar 16, 2022	Language Modeling, Language Modelling	Code Available	2
Accelerating Sparse Deep Neural Networks	Apr 16, 2021	GPU, Math	Code Available	2
Full Page Handwriting Recognition via Image to Sequence Extraction	Mar 11, 2021	Handwriting Recognition, Handwritten Text Recognition	Code Available	2
Measuring Mathematical Problem Solving With the MATH Dataset	Mar 5, 2021	Math, Mathematical Problem-Solving	Code Available	2
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination	Jul 14, 2025	Math, Mathematical Reasoning	Code Available	1
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning	Jul 11, 2025	Math, Mathematical Reasoning	Code Available	1
The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains	Jul 8, 2025	Math, MMLU	Code Available	1
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models	Jul 5, 2025	Benchmarking, GPU	Code Available	1
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective	Jun 22, 2025	In-Context Learning, Large Language Model	Code Available	1
OJBench: A Competition Level Code Benchmark For Large Language Models	Jun 19, 2025	Math	Code Available	1
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team	Jun 17, 2025	Code Generation, GSM8K	Code Available	1
Steering LLM Thinking with Budget Guidance	Jun 16, 2025	Math	Code Available	1
RePO: Replay-Enhanced Policy Optimization	Jun 11, 2025	Math, Mathematical Reasoning	Code Available	1
Resa: Transparent Reasoning Models via SAEs	Jun 11, 2025	Math	Code Available	1
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs	Jun 11, 2025	Code Generation, Diagnostic	Code Available	1
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning	Jun 10, 2025	Knowledge Distillation, Math	Code Available	1
WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning	Jun 9, 2025	Math, Mathematical Reasoning	Code Available	1
Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models	Jun 4, 2025	Math	Code Available	1
STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework	Jun 2, 2025	Math	Code Available	1
Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks	May 30, 2025	Autonomous Driving, Math	Code Available	1
