| Three Questions Concerning the Use of Large Language Models to Facilitate Mathematics Learning | Oct 20, 2023 | Mathematical Problem-Solving, Position | Unverified | 0 |
| Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving | Jan 28, 2025 | Math, Mathematical Problem-Solving | Unverified | 0 |
| Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH | Jan 30, 2025 | Language Modeling | Unverified | 0 |
| Towards Spoken Mathematical Reasoning: Benchmarking Speech-based Models over Multi-faceted Math Problems | May 21, 2025 | Benchmarking, Math | Unverified | 0 |
| The Buffer Mechanism for Multi-Step Information Reasoning in Language Models | May 24, 2024 | Mathematical Problem-Solving | Unverified | 0 |
| VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning | Oct 30, 2024 | Benchmarking, Hallucination | Unverified | 0 |
| Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving | May 20, 2024 | GSM8K, Math | Unverified | 0 |
| Mixture-of-Instructions: Comprehensive Alignment of a Large Language Model through the Mixture of Diverse System Prompting Instructions | Apr 29, 2024 | Language Modeling | Unverified | 0 |
| Navigating Semantic Relations: Challenges for Language Models in Abstract Common-Sense Reasoning | Feb 19, 2025 | Common Sense Reasoning, Mathematical Problem-Solving | Unverified | 0 |
| OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step | Jun 4, 2024 | Language Modeling | Unverified | 0 |
| On Vanishing Variance in Transformer Length Generalization | Apr 3, 2025 | Attribute, Mathematical Problem-Solving | Unverified | 0 |
| Performance Comparison of Large Language Models on Advanced Calculus Problems | Mar 5, 2025 | Math, Mathematical Problem-Solving | Unverified | 0 |
| Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks | Apr 19, 2024 | Mathematical Problem-Solving | Code Available | 0 |
| MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems | Mar 19, 2025 | Mathematical Problem-Solving | Code Available | 0 |
| LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning | Jun 16, 2025 | Code Generation, Mathematical Problem-Solving | Code Available | 0 |
| Large Language Models for Mathematical Analysis | Dec 28, 2024 | Mathematical Problem-Solving, Mathematical Reasoning | Code Available | 0 |
| Decomposing Elements of Problem Solving: What "Math" Does RL Teach? | May 28, 2025 | Math, Mathematical Problem-Solving | Code Available | 0 |
| Data Contamination Through the Lens of Time | Oct 16, 2023 | Mathematical Problem-Solving | Code Available | 0 |
| HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class | May 17, 2025 | Math, Mathematical Problem-Solving | Code Available | 0 |
| Chain-of-Code Collapse: Reasoning Failures in LLMs via Adversarial Prompting in Code Generation | Jun 8, 2025 | Code Generation, Mathematical Problem-Solving | Code Available | 0 |
| GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Jun 18, 2024 | Code Generation, Mathematical Problem-Solving | Code Available | 0 |
| Exploring LLM Reasoning Through Controlled Prompt Variations | Apr 2, 2025 | GSM8K, Mathematical Problem-Solving | Code Available | 0 |
| Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers | May 26, 2025 | Logical Reasoning, Mathematical Problem-Solving | Code Available | 0 |
| Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange | Mar 30, 2024 | Math, Mathematical Problem-Solving | Code Available | 0 |
| Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision | May 26, 2025 | Hallucination, Math | Code Available | 0 |