Mathematical Problem-Solving

Papers

Showing 51–100 of 106 papers

Title	Date	Tasks	Status
Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs	May 16, 2025	Mathematical Problem-Solving, Reinforcement Learning (RL)	Unverified
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning	May 14, 2025	Math, Mathematical Problem-Solving	Code Available
Reasoning Models Can Be Effective Without Thinking	Apr 14, 2025	Automated Theorem Proving, Mathematical Problem-Solving	Unverified
Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models	Apr 9, 2025	Instruction Following, Mathematical Problem-Solving	Unverified
LearNAT: Learning NL2SQL with AST-guided Task Decomposition for Large Language Models	Apr 3, 2025	Mathematical Problem-Solving, Prompt Engineering	Unverified
On Vanishing Variance in Transformer Length Generalization	Apr 3, 2025	Attribute, Mathematical Problem-Solving	Unverified
Exploring LLM Reasoning Through Controlled Prompt Variations	Apr 2, 2025	GSM8K, Mathematical Problem-Solving	Code Available
Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics	Apr 1, 2025	Math, Mathematical Problem-Solving	Unverified
MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection	Mar 23, 2025	Math, Mathematical Problem-Solving	Unverified
A Survey on Mathematical Reasoning and Optimization with Large Language Models	Mar 22, 2025	Automated Theorem Proving, Heuristic Search	Code Available
Does Chain-of-Thought Reasoning Help Mobile GUI Agent? An Empirical Study	Mar 21, 2025	Attribute, Mathematical Problem-Solving	Code Available
MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems	Mar 19, 2025	Mathematical Problem-Solving	Code Available
Performance Comparison of Large Language Models on Advanced Calculus Problems	Mar 5, 2025	Math, Mathematical Problem-Solving	Unverified
Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models	Mar 4, 2025	GSM8K, Math	Unverified
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models	Feb 25, 2025	Continual Learning, GSM8K	Unverified
How Do Large Language Monkeys Get Their Power (Laws)?	Feb 24, 2025	Language Modeling, Language Modelling	Unverified
Navigating Semantic Relations: Challenges for Language Models in Abstract Common-Sense Reasoning	Feb 19, 2025	Common Sense Reasoning, Mathematical Problem-Solving	Unverified
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task	Feb 17, 2025	Code Completion, GSM8K	Unverified
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving	Feb 17, 2025	Math, Mathematical Problem-Solving	Unverified
STRIVE: Structured Reasoning for Self-Improvement in Claim Verification	Feb 17, 2025	Claim Verification, Mathematical Problem-Solving	Unverified
Scaling Autonomous Agents via Automatic Reward Modeling And Planning	Feb 17, 2025	Decision Making, Mathematical Problem-Solving	Unverified
Advancing Reasoning in Large Language Models: Promising Methods and Approaches	Feb 5, 2025	Mathematical Problem-Solving, Survey	Unverified
Automating Mathematical Proof Generation Using Large Language Model Agents and Knowledge Graphs	Feb 4, 2025	Formal Logic, Knowledge Graphs	Unverified
Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH	Jan 30, 2025	Language Modeling, Language Modelling	Unverified
Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving	Jan 28, 2025	Math, Mathematical Problem-Solving	Unverified
Large Language Models for Mathematical Analysis	Dec 28, 2024	Mathematical Problem-Solving, Mathematical Reasoning	Code Available
Kwai-STaR: Transform LLMs into State-Transition Reasoners	Nov 7, 2024	GSM8K, Mathematical Problem-Solving	Unverified
VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning	Oct 30, 2024	Benchmarking, Hallucination	Unverified
Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks	Oct 24, 2024	Logical Reasoning, Mathematical Problem-Solving	Unverified
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning	Oct 8, 2024	GSM8K, Hallucination	Unverified
Can LLMs plan paths with extra hints from solvers?	Oct 7, 2024	Mathematical Problem-Solving, Program Synthesis	Unverified
PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation	Oct 2, 2024	Data Augmentation, Diversity	Unverified
Building Math Agents with Multi-Turn Iterative Preference Learning	Sep 4, 2024	GSM8K, Math	Unverified
Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems	Aug 29, 2024	GSM8K, Language Modeling	Unverified
Benchmarking Large Language Models for Math Reasoning Tasks	Aug 20, 2024	Benchmarking, In-Context Learning	Code Available
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory	Jun 18, 2024	Code Generation, Mathematical Problem-Solving	Code Available
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning	Jun 16, 2024	Benchmarking, Math	Unverified
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward	Jun 11, 2024	Instruction Following, Mathematical Problem-Solving	Unverified
OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step	Jun 4, 2024	Language Modeling, Language Modelling	Unverified
The Buffer Mechanism for Multi-Step Information Reasoning in Language Models	May 24, 2024	Mathematical Problem-Solving	Unverified
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving	May 20, 2024	GSM8K, Math	Unverified
Mixture-of-Instructions: Comprehensive Alignment of a Large Language Model through the Mixture of Diverse System Prompting Instructions	Apr 29, 2024	Language Modeling, Language Modelling	Unverified
Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks	Apr 19, 2024	Mathematical Problem-Solving	Code Available
Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange	Mar 30, 2024	Math, Mathematical Problem-Solving	Code Available
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models	Mar 12, 2024	Math, Mathematical Problem-Solving	Code Available
Premise Order Matters in Reasoning with Large Language Models	Feb 14, 2024	GSM8K, Mathematical Problem-Solving	Unverified
Large Language Models for Mathematical Reasoning: Progresses and Challenges	Jan 31, 2024	Diversity, Math	Unverified
Three Questions Concerning the Use of Large Language Models to Facilitate Mathematics Learning	Oct 20, 2023	Mathematical Problem-Solving, Position	Unverified
SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving	Oct 19, 2023	GSM8K, Math	Code Available
Data Contamination Through the Lens of Time	Oct 16, 2023	Mathematical Problem-Solving	Code Available

