| Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts | Jun 24, 2024 | Mathematical Reasoning, Visual Question Answering (VQA) | —Unverified | 0 | 0 |
| Low-Rank Adaptation with Task-Relevant Feature Enhancement for Fine-tuning Language Models | Dec 13, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| LPML: LLM-Prompting Markup Language for Mathematical Reasoning | Sep 21, 2023 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection | Nov 13, 2024 | Code Generation, Mathematical Reasoning | —Unverified | 0 | 0 |
| Machine learning and information theory concepts towards an AI Mathematician | Mar 7, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| MAPS: A Multilingual Benchmark for Global Agent Performance and Security | May 21, 2025 | Code Generation, Math | —Unverified | 0 | 0 |
| Markov Chain of Thought for Efficient Mathematical Reasoning | Oct 23, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Mars-PO: Multi-Agent Reasoning System Preference Optimization | Nov 28, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality | Jun 17, 2025 | Code Generation, Mathematical Reasoning | —Unverified | 0 | 0 |
| MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications | Nov 28, 2024 | document understanding, Mathematical Reasoning | —Unverified | 0 | 0 |
| MathDivide: Improved mathematical reasoning by large language models | May 12, 2024 | GSM8K, Logical Reasoning | —Unverified | 0 | 0 |
| Assessment of Evolving Large Language Models in Upper Secondary Mathematics | Apr 15, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Mathematical Reasoning in Latent Space | Sep 26, 2019 | Mathematical Reasoning | —Unverified | 0 | 0 |
| MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task | Feb 17, 2025 | Code Completion, GSM8K | —Unverified | 0 | 0 |
| MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs | Feb 26, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams | Mar 26, 2025 | Mathematical Reasoning, Object Counting | —Unverified | 0 | 0 |
| MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model | Sep 10, 2024 | Diversity, Language Modeling | —Unverified | 0 | 0 |
| MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs | Oct 7, 2024 | Information Retrieval, Mathematical Reasoning | —Unverified | 0 | 0 |
| MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations | Feb 10, 2025 | Benchmarking, In-Context Learning | —Unverified | 0 | 0 |
| math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories | Oct 25, 2023 | Automated Theorem Proving, Language Modeling | —Unverified | 0 | 0 |
| MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? | Mar 21, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| MCP-RADAR: A Multi-Dimensional Benchmark for Evaluating Tool Use Capabilities in Large Language Models | May 22, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs | Feb 3, 2025 | Mathematical Reasoning, Mixture-of-Experts | —Unverified | 0 | 0 |
| ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models | Jun 13, 2024 | Code Generation, domain classification | —Unverified | 0 | 0 |
| INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models | Sep 28, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |