Mathematical Problem-Solving

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 51–75 of 106 papers

Title	Date	Tasks	Status
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning	Jun 16, 2024	BenchmarkingMath	—Unverified
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning	Oct 8, 2024	GSM8KHallucination	—Unverified
Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models	Apr 9, 2025	Instruction FollowingMathematical Problem-Solving	—Unverified
How Do Large Language Monkeys Get Their Power (Laws)?	Feb 24, 2025	Language ModelingLanguage Modelling	—Unverified
Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks	Oct 24, 2024	Logical ReasoningMathematical Problem-Solving	—Unverified
JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving	Jun 19, 2023	In-Context LearningLanguage Modeling	—Unverified
Kwai-STaR: Transform LLMs into State-Transition Reasoners	Nov 7, 2024	GSM8KMathematical Problem-Solving	—Unverified
Large Language Models for Mathematical Reasoning: Progresses and Challenges	Jan 31, 2024	DiversityMath	—Unverified
LearNAT: Learning NL2SQL with AST-guided Task Decomposition for Large Language Models	Apr 3, 2025	Mathematical Problem-SolvingPrompt Engineering	—Unverified
Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems	Aug 29, 2024	GSM8KLanguage Modeling	—Unverified
MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection	Mar 23, 2025	MathMathematical Problem-Solving	—Unverified
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task	Feb 17, 2025	Code CompletionGSM8K	—Unverified
PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation	Oct 2, 2024	Data AugmentationDiversity	—Unverified
PoLAR: Polar-Decomposed Low-Rank Adapter Representation	Jun 3, 2025	Mathematical Problem-SolvingRiemannian optimization	—Unverified
Premise Order Matters in Reasoning with Large Language Models	Feb 14, 2024	GSM8KMathematical Problem-Solving	—Unverified
Reasoning Models Can Be Effective Without Thinking	Apr 14, 2025	Automated Theorem ProvingMathematical Problem-Solving	—Unverified
Scaling Autonomous Agents via Automatic Reward Modeling And Planning	Feb 17, 2025	Decision MakingMathematical Problem-Solving	—Unverified
Scaling Laws for Autoregressive Generative Modeling	Oct 28, 2020	Mathematical Problem-Solving	—Unverified
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models	Feb 25, 2025	Continual LearningGSM8K	—Unverified
Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models	Mar 4, 2025	GSM8KMath	—Unverified
SMART: Self-Generating and Self-Validating Multi-Dimensional Assessment for LLMs' Mathematical Problem Solving	May 22, 2025	DiagnosticMathematical Problem-Solving	—Unverified
STRIVE: Structured Reasoning for Self-Improvement in Claim Verification	Feb 17, 2025	Claim VerificationMathematical Problem-Solving	—Unverified
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving	Feb 17, 2025	MathMathematical Problem-Solving	—Unverified
TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving	Jun 12, 2025	Logical ReasoningMathematical Problem-Solving	—Unverified
The Consensus Game: Language Model Generation via Equilibrium Search	Oct 13, 2023	Language ModelingLanguage Modelling	—Unverified

Show:10 25 50

← PrevPage 3 of 5Next →

No leaderboard results yet.