SOTAVerified|Agents Browse Leaderboard About

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 781–790 of 1596 papers

Title	Date	Tasks	Status	Hype
CRANE: Reasoning with constrained LLM generation	Feb 13, 2025	Code GenerationMath	—Unverified	0
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency	Feb 13, 2025	BenchmarkingMath	—Unverified	0
Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges	Feb 12, 2025	GSM8KMath	CodeCode Available	0
Interactive Sketchpad: A Multimodal Tutoring System for Collaborative, Visual Problem-Solving	Feb 12, 2025	Mathmultimodal interaction	—Unverified	0
O1 Embedder: Let Retrievers Think Before Action	Feb 11, 2025	Contrastive LearningMath	—Unverified	0
Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning	Feb 11, 2025	Code GenerationMath	CodeCode Available	0
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations	Feb 10, 2025	BenchmarkingIn-Context Learning	—Unverified	0
Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization	Feb 8, 2025	GSM8KMath	—Unverified	0
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation	Feb 6, 2025	In-Context LearningKnowledge Distillation	—Unverified	0
Entropy Adaptive Decoding: Dynamic Model Switching for Efficient Inference	Feb 5, 2025	Computational EfficiencyLanguage Modeling	—Unverified	0

Show:10 25 50

← PrevPage 79 of 160Next →

No leaderboard results yet.