SOTAVerified|Agents Browse Leaderboard About Blog

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 76–100 of 1596 papers

Title	Date	Tasks	Status	Hype
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding	Apr 25, 2024	GSM8KHellaSwag	CodeCode Available	3
Spurious Rewards: Rethinking Training Signals in RLVR	Jun 12, 2025	MathMathematical Reasoning	CodeCode Available	3
Self-Discover: Large Language Models Self-Compose Reasoning Structures	Feb 6, 2024	Math	CodeCode Available	3
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling	Jul 31, 2024	GSM8KMath	CodeCode Available	3
Llemma: An Open Language Model For Mathematics	Oct 16, 2023	Arithmetic ReasoningAutomated Theorem Proving	CodeCode Available	3
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs	Jun 26, 2024	Arithmetic ReasoningGSM8K	CodeCode Available	3
Training Verifiers to Solve Math Word Problems	Oct 27, 2021	GSM8KMath	CodeCode Available	3
Reinforcement Learning for Reasoning in Large Language Models with One Training Example	Apr 29, 2025	Domain GeneralizationMath	CodeCode Available	3
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling	Feb 10, 2025	Math	CodeCode Available	3
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving	Feb 11, 2025	Automated Theorem ProvingLarge Language Model	CodeCode Available	3
General-Reasoner: Advancing LLM Reasoning Across All Domains	May 20, 2025	AllMath	CodeCode Available	3
Rho-1: Not All Tokens Are What You Need	Apr 11, 2024	AllContinual Pretraining	CodeCode Available	3
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks	Nov 22, 2022	Math	CodeCode Available	3
PAL: Program-aided Language Models	Nov 18, 2022	Arithmetic ReasoningGSM8K	CodeCode Available	3
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation	Jan 9, 2024	GPUMath	CodeCode Available	3
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	May 13, 2024	Data AugmentationGSM8K	CodeCode Available	3
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning	May 1, 2024	ARCGSM8K	CodeCode Available	3
Noise Contrastive Alignment of Language Models with Explicit Rewards	Feb 8, 2024	Language ModellingMath	CodeCode Available	3
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition	Oct 9, 2023	Code GenerationInstruction Following	CodeCode Available	3
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning	May 15, 2025	cross-modal alignmentGeometry Problem Solving	CodeCode Available	3
MathArena: Evaluating LLMs on Uncontaminated Math Competitions	May 29, 2025	MathMathematical Reasoning	CodeCode Available	3
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models	Apr 3, 2024	GPUMath	CodeCode Available	3
Scaling up Masked Diffusion Models on Text	Oct 24, 2024	GSM8KLanguage Modeling	CodeCode Available	3
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory	Apr 10, 2025	MathMMLU	CodeCode Available	3
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning	Jun 13, 2024	Instruction FollowingMath	CodeCode Available	3

Show:10 25 50

← PrevPage 4 of 64Next →

No leaderboard results yet.