SOTAVerified|Agents Browse Leaderboard About

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 451–475 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Pretrained Language Models are Symbolic Mathematics Solvers too!	Oct 7, 2021	IngenuityLanguage Modelling	CodeCode Available	1	5
Reasoning with Reinforced Functional Token Tuning	Feb 19, 2025	Math	CodeCode Available	1	5
MathViz-E: A Case-study in Domain-Specialized Tool-Using Agents	Jul 24, 2024	Math	CodeCode Available	1	5
Eliciting Latent Knowledge from Quirky Language Models	Dec 2, 2023	Anomaly DetectionMath	CodeCode Available	1	5
PECC: Problem Extraction and Coding Challenges	Apr 29, 2024	Code GenerationMath	CodeCode Available	1	5
Eliminating Position Bias of Language Models: A Mechanistic Approach	Jul 1, 2024	Mathobject-detection	CodeCode Available	1	5
Explaining Datasets in Words: Statistical Models with Natural Language Parameters	Sep 13, 2024	ClusteringLanguage Modeling	CodeCode Available	1	5
Aioli: A Unified Optimization Framework for Language Model Data Mixing	Nov 8, 2024	Language ModelingLanguage Modelling	CodeCode Available	1	5
CityGPT: Empowering Urban Spatial Cognition of Large Language Models	Jun 20, 2024	Code GenerationMath	CodeCode Available	1	5
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents	Aug 2, 2024	Code GenerationLarge Language Model	CodeCode Available	1	5
Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations	Oct 31, 2023	GSM8KMath	CodeCode Available	1	5
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective	Jun 22, 2025	In-Context LearningLarge Language Model	CodeCode Available	1	5
EXAONE Deep: Reasoning Enhanced Language Models	Mar 16, 2025	Math	CodeCode Available	1	5
Over-Reasoning and Redundant Calculation of Large Language Models	Jan 21, 2024	GSM8KMath	CodeCode Available	1	5
ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models	May 23, 2023	Math	CodeCode Available	1	5
ArMATH: a Dataset for Solving Arabic Math Word Problems	Jun 1, 2022	Deep LearningMath	CodeCode Available	1	5
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping	Feb 16, 2025	Code GenerationInstruction Following	CodeCode Available	1	5
Ape210K: A Large-Scale and Template-Rich Dataset of Math Word Problems	Sep 24, 2020	DiversityMath	CodeCode Available	1	5
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics	Oct 28, 2024	Arithmetic ReasoningMath	CodeCode Available	1	5
Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula	Aug 8, 2024	GSM8KLanguage Modeling	CodeCode Available	1	5
Expression Syntax Information Bottleneck for Math Word Problems	Oct 24, 2023	Math	CodeCode Available	1	5
Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament	Jan 22, 2025	Math	CodeCode Available	1	5
Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts	Oct 23, 2023	Logical ReasoningMath	CodeCode Available	1	5
OJBench: A Competition Level Code Benchmark For Large Language Models	Jun 19, 2025	Math	CodeCode Available	1	5
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation	Dec 28, 2023	GSM8KLanguage Model Evaluation	CodeCode Available	1	5

Show:10 25 50

← PrevPage 19 of 64Next →

No leaderboard results yet.