SOTAVerified

Mathematical Reasoning

Papers

Showing 1-10 of 805 papers

| Title | Status | Hype |
| --- | --- | --- |
| VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks | | 0 |
| A Survey of Deep Learning for Geometry Problem Solving | Code | 0 |
| KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? | | 0 |
| Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination | Code | 1 |
| A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning | Code | 1 |
| Integrating External Tools with Large Language Models to Improve Accuracy | | 0 |
| Agentic-R1: Distilled Dual-Strategy Reasoning | Code | 0 |
| CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization | Code | 1 |
| CoRE: Enhancing Metacognition with Label-free Self-evaluation in LRMs | | 0 |
| Skywork-R1V3 Technical Report | Code | 7 |

Benchmark Results

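| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | o3 | Accuracy | 0.25 | | Unverified |
| 2 | Gemini 1.5 Pro (002) | Accuracy | 0.02 | | Unverified |
| 3 | Claude 3.5 Sonnet | Accuracy | 0.01 | | Unverified |
| 4 | o1-preview | Accuracy | 0.01 | | Unverified |
| 5 | o1-mini | Accuracy | 0.01 | | Unverified |
| 6 | GPT-4o | Accuracy | 0.01 | | Unverified |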