Math

Papers

Showing 751–775 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning	May 29, 2023	Language Modelling, Large Language Model	Code Available	0	5
Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning	Aug 7, 2023	In-Context Learning, Math	Code Available	0	5
Leveraging Web-Crawled Data for High-Quality Fine-Tuning	Aug 15, 2024	Language Modeling, Language Modelling	Code Available	0	5
Can We Use Small Models to Investigate Multimodal Fusion Methods?	Sep 1, 2022	Math	Code Available	0	5
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process	May 10, 2024	Geometry Problem Solving, Machine Translation	Code Available	0	5
Leveraging Label Semantics and Meta-Label Refinement for Multi-Label Question Classification	Nov 4, 2024	Math, Reranking	Code Available	0	5
Can Vision-Language Models Evaluate Handwritten Math?	Jan 13, 2025	Math	Code Available	0	5
AI-Assisted Generation of Difficult Math Questions	Jul 30, 2024	Math, Mathematical Reasoning	Code Available	0	5
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning	Feb 24, 2025	Math, Mathematical Reasoning	Code Available	0	5
MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark	Aug 14, 2024	Math, Mathematical Reasoning	Code Available	0	5
OntoMath^PRO Ontology: A Linked Data Hub for Mathematics	Jul 17, 2014	Math	Code Available	0	5
Examining the Robustness of Large Language Models across Language Complexity	Jan 30, 2025	Math	Unverified	0	0
Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil	Aug 9, 2024	Math, Multiple-choice	Unverified	0	0
Can Stories Help LLMs Reason? Curating Information Space Through Narrative	Oct 25, 2024	Math	Unverified	0	0
Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization	Feb 8, 2025	GSM8K, Math	Unverified	0	0
Can LLMs understand Math? -- Exploring the Pitfalls in Mathematical Reasoning	May 21, 2025	Math, Mathematical Reasoning	Unverified	0	0
A range characterization of the single-quadrant ADRT	Oct 11, 2020	Math	Unverified	0	0
EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages	Feb 12, 2024	Automated Theorem Proving, Benchmarking	Unverified	0	0
AI4Math: A Native Spanish Benchmark for University-Level Mathematical Reasoning in Large Language Models	May 25, 2025	Math, Mathematical Reasoning	Unverified	0	0
Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning	Dec 23, 2024	Math	Unverified	0	0
Evaluating Robustness of Reward Models for Mathematical Reasoning	Oct 2, 2024	Math, Mathematical Reasoning	Unverified	0	0
Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation	May 29, 2025	GSM8K, Math	Unverified	0	0
Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics	Apr 24, 2025	Code Generation, Math	Unverified	0	0
Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams	Nov 7, 2024	Math	Unverified	0	0
A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions	Dec 12, 2024	GSM8K, Knowledge Graphs	Unverified	0	0
