Mathematical Problem-Solving

Papers

Showing 21–30 of 106 papers

Title	Date	Tasks	Status	Hype
Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models	Feb 16, 2025	Language Modeling, Language Modelling	Code Available	1
Forgotten Polygons: Multimodal Large Language Models are Shape-Blind	Feb 21, 2025	Math, Mathematical Problem-Solving	Code Available	1
Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities	Feb 17, 2025	Code Generation, HumanEval	Code Available	1
Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers	Apr 1, 2023	Inductive Bias, Mathematical Problem-Solving	Code Available	1
MathCAMPS: Fine-grained Synthesis of Mathematical Problems From Human Curricula	Jul 1, 2024	Mathematical Problem-Solving	Code Available	1
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets	May 29, 2023	Bias Detection, Code Generation	Code Available	1
MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions	May 29, 2024	Benchmarking, Dialogue Understanding	Code Available	1
MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion	Mar 20, 2025	Data Augmentation, Mathematical Problem-Solving	Code Available	1
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks	Apr 23, 2024	Mathematical Problem-Solving, Question Answering	Code Available	1
Non-myopic Generation of Language Models for Reasoning and Planning	Oct 22, 2024	Computational Efficiency, Language Modelling	Code Available	1

No leaderboard results yet.