SOTAVerified

Math

Papers

Showing 361–370 of 1596 papers

Title | Status | Hype
HARP: A challenging human-annotated math reasoning benchmark | Code | 1
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning | Code | 1
MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports | Code | 1
HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems | Code | 1
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Code | 1
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models | Code | 1
Measuring Conversational Uptake: A Case Study on Student-Teacher Interactions | Code | 1
MATHWELL: Generating Educational Math Word Problems Using Teacher Annotations | Code | 1
Math Word Problem Solving with Explicit Numerical Values | Code | 1
Entropy-Based Adaptive Weighting for Self-Training | Code | 1
Page 37 of 160

No leaderboard results yet.