Mathematical Problem-Solving

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 51–100 of 106 papers

Title	Date	Tasks	Status	Hype
Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving	Jan 28, 2025	MathMathematical Problem-Solving	—Unverified	0
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs	Jan 11, 2025	MathMathematical Problem-Solving	CodeCode Available	1
VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models	Jan 9, 2025	BenchmarkingMathematical Problem-Solving	CodeCode Available	1
Efficiently Serving LLM Reasoning Programs with Certaindex	Dec 30, 2024	Code GenerationMathematical Problem-Solving	CodeCode Available	3
Large Language Models for Mathematical Analysis	Dec 28, 2024	Mathematical Problem-SolvingMathematical Reasoning	CodeCode Available	0
Training and Evaluating Language Models with Template-based Data Generation	Nov 27, 2024	Data AugmentationMath	CodeCode Available	1
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?	Nov 25, 2024	HallucinationKnowledge Distillation	CodeCode Available	7
Kwai-STaR: Transform LLMs into State-Transition Reasoners	Nov 7, 2024	GSM8KMathematical Problem-Solving	—Unverified	0
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent	Nov 4, 2024	Logical ReasoningMathematical Problem-Solving	CodeCode Available	5
VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning	Oct 30, 2024	BenchmarkingHallucination	—Unverified	0
Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks	Oct 24, 2024	Logical ReasoningMathematical Problem-Solving	—Unverified	0
Non-myopic Generation of Language Models for Reasoning and Planning	Oct 22, 2024	Computational EfficiencyLanguage Modelling	CodeCode Available	1
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning	Oct 8, 2024	GSM8KHallucination	—Unverified	0
Can LLMs plan paths with extra hints from solvers?	Oct 7, 2024	Mathematical Problem-SolvingProgram Synthesis	—Unverified	0
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning	Oct 3, 2024	Efficient ExplorationMathematical Problem-Solving	CodeCode Available	5
PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation	Oct 2, 2024	Data AugmentationDiversity	—Unverified	0
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search	Sep 26, 2024	MathMathematical Problem-Solving	CodeCode Available	1
Building Math Agents with Multi-Turn Iterative Preference Learning	Sep 4, 2024	GSM8KMath	—Unverified	0
Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems	Aug 29, 2024	GSM8KLanguage Modeling	—Unverified	0
Benchmarking Large Language Models for Math Reasoning Tasks	Aug 20, 2024	BenchmarkingIn-Context Learning	CodeCode Available	0
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine	Jul 11, 2024	Contrastive LearningLanguage Modelling	CodeCode Available	4
MathCAMPS: Fine-grained Synthesis of Mathematical Problems From Human Curricula	Jul 1, 2024	Mathematical Problem-Solving	CodeCode Available	1
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data	Jun 26, 2024	BenchmarkingMath	CodeCode Available	2
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models	Jun 25, 2024	DiversityMath	CodeCode Available	2
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving	Jun 18, 2024	Arithmetic ReasoningMath	CodeCode Available	2
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory	Jun 18, 2024	Code GenerationMathematical Problem-Solving	CodeCode Available	0
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning	Jun 16, 2024	BenchmarkingMath	—Unverified	0
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward	Jun 11, 2024	Instruction FollowingMathematical Problem-Solving	—Unverified	0
OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step	Jun 4, 2024	Language ModelingLanguage Modelling	—Unverified	0
MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions	May 29, 2024	BenchmarkingDialogue Understanding	CodeCode Available	1
The Buffer Mechanism for Multi-Step Information Reasoning in Language Models	May 24, 2024	Mathematical Problem-Solving	—Unverified	0
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving	May 20, 2024	GSM8KMath	—Unverified	0
Mixture-of-Instructions: Comprehensive Alignment of a Large Language Model through the Mixture of Diverse System Prompting Instructions	Apr 29, 2024	Language ModelingLanguage Modelling	—Unverified	0
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks	Apr 23, 2024	Mathematical Problem-SolvingQuestion Answering	CodeCode Available	1
Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks	Apr 19, 2024	Mathematical Problem-Solving	CodeCode Available	0
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline	Apr 3, 2024	MathMathematical Problem-Solving	CodeCode Available	2
Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange	Mar 30, 2024	MathMathematical Problem-Solving	CodeCode Available	0
PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models	Mar 26, 2024	Code CompletionFew-Shot Learning	CodeCode Available	3
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models	Mar 12, 2024	MathMathematical Problem-Solving	CodeCode Available	0
Premise Order Matters in Reasoning with Large Language Models	Feb 14, 2024	GSM8KMathematical Problem-Solving	—Unverified	0
Large Language Models for Mathematical Reasoning: Progresses and Challenges	Jan 31, 2024	DiversityMath	—Unverified	0
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model	Dec 18, 2023	Language ModelingLanguage Modelling	CodeCode Available	4
Three Questions Concerning the Use of Large Language Models to Facilitate Mathematics Learning	Oct 20, 2023	Mathematical Problem-SolvingPosition	—Unverified	0
SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving	Oct 19, 2023	GSM8KMath	CodeCode Available	0
Data Contamination Through the Lens of Time	Oct 16, 2023	Mathematical Problem-Solving	CodeCode Available	0
The Consensus Game: Language Model Generation via Equilibrium Search	Oct 13, 2023	Language ModelingLanguage Modelling	—Unverified	0
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving	Sep 29, 2023	Arithmetic ReasoningComputational Efficiency	CodeCode Available	3
Beyond Traditional Teaching: The Potential of Large Language Models and Chatbots in Graduate Engineering Education	Sep 9, 2023	ChatbotMathematical Problem-Solving	—Unverified	0
Bayesian artificial brain with ChatGPT	Aug 28, 2023	Mathematical Problem-Solving	—Unverified	0
JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving	Jun 19, 2023	In-Context LearningLanguage Modeling	—Unverified	0

Show:10 25 50

← PrevPage 2 of 3Next →

No leaderboard results yet.