Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 401–450 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT	Apr 3, 2024	BenchmarkingGeneral Knowledge	CodeCode Available	1	5
Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models	Jun 4, 2025	Math	CodeCode Available	1	5
PECC: Problem Extraction and Coding Challenges	Apr 29, 2024	Code GenerationMath	CodeCode Available	1	5
GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning	May 30, 2021	MathMathematical Reasoning	CodeCode Available	1	5
Pretrained Language Models are Symbolic Mathematics Solvers too!	Oct 7, 2021	IngenuityLanguage Modelling	CodeCode Available	1	5
Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement	Sep 17, 2024	Active LearningDiversity	CodeCode Available	1	5
A Symbolic Character-Aware Model for Solving Geometry Problems	Aug 5, 2023	MathMulti-Label Classification	CodeCode Available	1	5
Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament	Jan 22, 2025	Math	CodeCode Available	1	5
From Zero to Hero: Convincing with Extremely Complicated Math	Apr 1, 2023	Math	CodeCode Available	1	5
From GAN to WGAN	Apr 18, 2019	Generative Adversarial NetworkMath	CodeCode Available	1	5
A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level	Dec 31, 2021	Few-Shot LearningLanguage Modelling	CodeCode Available	1	5
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models	Jul 5, 2025	BenchmarkingGPU	CodeCode Available	1	5
Get an A in Math: Progressive Rectification Prompting	Dec 11, 2023	Math	CodeCode Available	1	5
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents	Nov 16, 2023	Math	CodeCode Available	1	5
CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning	Oct 14, 2024	MathMathematical Reasoning	CodeCode Available	1	5
Collective Constitutional AI: Aligning a Language Model with Public Input	Jun 12, 2024	Language ModelingLanguage Modelling	CodeCode Available	1	5
A Categorical Archive of ChatGPT Failures	Feb 6, 2023	Math	CodeCode Available	1	5
Over-Reasoning and Redundant Calculation of Large Language Models	Jan 21, 2024	GSM8KMath	CodeCode Available	1	5
Problem-Oriented Segmentation and Retrieval: Case Study on Tutoring Conversations	Nov 12, 2024	MathRetrieval	CodeCode Available	1	5
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs	Jan 11, 2025	MathMathematical Problem-Solving	CodeCode Available	1	5
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models	Apr 14, 2025	MambaMath	CodeCode Available	1	5
Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation	Apr 15, 2025	MathQuantum Machine Learning	CodeCode Available	1	5
Forgotten Polygons: Multimodal Large Language Models are Shape-Blind	Feb 21, 2025	MathMathematical Problem-Solving	CodeCode Available	1	5
Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities	Feb 17, 2025	Code GenerationHumanEval	CodeCode Available	1	5
Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers	Dec 7, 2023	MathMultiple-choice	CodeCode Available	1	5
DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning	Jul 4, 2024	AvgGSM8K	CodeCode Available	1	5
OJBench: A Competition Level Code Benchmark For Large Language Models	Jun 19, 2025	Math	CodeCode Available	1	5
FELM: Benchmarking Factuality Evaluation of Large Language Models	Oct 1, 2023	BenchmarkingMath	CodeCode Available	1	5
FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving	Feb 27, 2025	GSM8KMath	CodeCode Available	1	5
FormulaNet: A Benchmark Dataset for Mathematical Formula Detection	Aug 29, 2022	Math	CodeCode Available	1	5
Expression Syntax Information Bottleneck for Math Word Problems	Oct 24, 2023	Math	CodeCode Available	1	5
Explaining Datasets in Words: Statistical Models with Natural Language Parameters	Sep 13, 2024	ClusteringLanguage Modeling	CodeCode Available	1	5
EXAONE Deep: Reasoning Enhanced Language Models	Mar 16, 2025	Math	CodeCode Available	1	5
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective	Jun 22, 2025	In-Context LearningLarge Language Model	CodeCode Available	1	5
CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning	Aug 10, 2022	MathMathematical Reasoning	CodeCode Available	1	5
Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning	Sep 29, 2022	Logical ReasoningMath	CodeCode Available	1	5
BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing	Apr 2, 2025	3D ReconstructionBenchmarking	CodeCode Available	1	5
Non-Autoregressive Math Word Problem Solver with Unified Tree Structure	May 8, 2023	Mathvalid	CodeCode Available	1	5
NeMo-Inspector: A Visualization Tool for LLM Generation Analysis	May 1, 2025	GSM8KMath	CodeCode Available	1	5
Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula	Aug 8, 2024	GSM8KLanguage Modeling	CodeCode Available	1	5
Nerva: a Truly Sparse Implementation of Neural Networks	Jul 24, 2024	Math	CodeCode Available	1	5
Aioli: A Unified Optimization Framework for Language Model Data Mixing	Nov 8, 2024	Language ModelingLanguage Modelling	CodeCode Available	1	5
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning	Sep 19, 2023	Instruction FollowingLanguage Modeling	CodeCode Available	1	5
Neural-Symbolic Solver for Math Word Problems with Auxiliary Tasks	Jul 3, 2021	DecoderMath	CodeCode Available	1	5
CityGPT: Empowering Urban Spatial Cognition of Large Language Models	Jun 20, 2024	Code GenerationMath	CodeCode Available	1	5
Mathematical Capabilities of ChatGPT	Jan 31, 2023	Elementary MathematicsMath	CodeCode Available	1	5
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents	Aug 2, 2024	Code GenerationLarge Language Model	CodeCode Available	1	5
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees	Mar 11, 2025	ChatbotLanguage Modeling	CodeCode Available	1	5
Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning	Jun 4, 2023	Math	CodeCode Available	1	5
MWPToolkit: An Open-Source Framework for Deep Learning-Based Math Word Problem Solvers	Sep 2, 2021	MathMath Word Problem Solving	CodeCode Available	1	5

Show:10 25 50

← PrevPage 9 of 32Next →

No leaderboard results yet.