Math

Papers

Showing 361–370 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
HARP: A challenging human-annotated math reasoning benchmark	Dec 11, 2024	Math	Code Available	1	5
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning	Sep 19, 2023	Instruction Following, Language Modeling	Code Available	1	5
MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports	May 16, 2025	Diagnostic, Math	Code Available	1	5
HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems	May 17, 2025	Arithmetic Reasoning, Code Generation	Code Available	1	5
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics	Oct 13, 2024	Language Modeling, Language Modelling	Code Available	1	5
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models	Apr 8, 2025	Math, Multimodal Reasoning	Code Available	1	5
Measuring Conversational Uptake: A Case Study on Student-Teacher Interactions	Jun 7, 2021	Math, Question Answering	Code Available	1	5
MATHWELL: Generating Educational Math Word Problems Using Teacher Annotations	Feb 24, 2024	Language Modeling, Language Modelling	Code Available	1	5
Math Word Problem Solving with Explicit Numerical Values	Aug 1, 2021	Math, Math Word Problem Solving	Code Available	1	5
Entropy-Based Adaptive Weighting for Self-Training	Mar 31, 2025	GSM8K, Math	Code Available	1	5

