Math

Papers

Showing 221–230 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Can AI Assistants Know What They Don't Know?	Jan 24, 2024	Math, Open-Domain Question Answering	Code Available	2	5
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model	Oct 17, 2024	Math	Code Available	2	5
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters	May 27, 2024	Benchmarking, GSM8K	Code Available	2	5
Agent Lumos: Unified and Modular Training for Open-Source Language Agents	Nov 9, 2023	Math, Question Answering	Code Available	2	5
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark	May 20, 2024	College Mathematics, GSM8K	Code Available	2	5
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training	Nov 24, 2024	Math, Mixture-of-Experts	Code Available	2	5
Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning	May 5, 2024	GSM8K, Math	Code Available	2	5
Archon: An Architecture Search Framework for Inference-Time Techniques	Sep 23, 2024	Hyperparameter Optimization, Instruction Following	Code Available	2	5
AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions	Jun 10, 2025	Math	Code Available	2	5
Evaluating Mathematical Reasoning Beyond Accuracy	Apr 8, 2024	Math, Mathematical Reasoning	Code Available	2	5

