| MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams | Mar 26, 2025 | Mathematical Reasoning, Object Counting | —Unverified | 0 |
| MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model | Sep 10, 2024 | Diversity, Language Modeling | —Unverified | 0 |
| MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs | Oct 7, 2024 | Information Retrieval, Mathematical Reasoning | —Unverified | 0 |
| MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations | Feb 10, 2025 | Benchmarking, In-Context Learning | —Unverified | 0 |
| math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories | Oct 25, 2023 | Automated Theorem Proving, Language Modeling | —Unverified | 0 |
| MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? | Mar 21, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |
| MCP-RADAR: A Multi-Dimensional Benchmark for Evaluating Tool Use Capabilities in Large Language Models | May 22, 2025 | Mathematical Reasoning | —Unverified | 0 |
| MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs | Feb 3, 2025 | Mathematical Reasoning, Mixture-of-Experts | —Unverified | 0 |
| ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models | Jun 13, 2024 | Code Generation, domain classification | —Unverified | 0 |
| INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models | Sep 28, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |