
Math

Papers

Showing 771–780 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Evaluating Robustness of Reward Models for Mathematical Reasoning	Oct 2, 2024	Math, Mathematical Reasoning	Unverified	0	0
Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation	May 29, 2025	GSM8K, Math	Unverified	0	0
Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics	Apr 24, 2025	Code Generation, Math	Unverified	0	0
Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams	Nov 7, 2024	Math	Unverified	0	0
A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions	Dec 12, 2024	GSM8K, Knowledge Graphs	Unverified	0	0
A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science	Mar 21, 2024	Active Learning, Math	Unverified	0	0
Can I understand what I create? Self-Knowledge Evaluation of Large Language Models	Jun 10, 2024	Math	Unverified	0	0
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate	May 22, 2023	Benchmarking, Math	Unverified	0	0
Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework	Jan 26, 2025	Math, Mathematical Reasoning	Unverified	0	0
A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio	Sep 10, 2024	Emotional Intelligence, Math	Unverified	0	0


No leaderboard results yet.