| Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs | Jan 11, 2025 | Math, Mathematical Problem-Solving | Code Available | 1 | 5 |
| PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization | Sep 25, 2024 | 8k, Domain Adaptation | Code Available | 1 | 5 |
| Lila: A Unified Benchmark for Mathematical Reasoning | Oct 31, 2022 | Diversity, Mathematical Reasoning | Code Available | 1 | 5 |
| Large Language Models for Multi-Robot Systems: A Survey | Feb 6, 2025 | Action Generation, Benchmarking | Code Available | 1 | 5 |
| Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents | Feb 18, 2024 | Mathematical Reasoning, Multi-hop Question Answering | Code Available | 1 | 5 |
| JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models | May 23, 2024 | Knowledge Distillation, Math | Code Available | 1 | 5 |
| Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems | Oct 24, 2024 | Mathematical Reasoning | Code Available | 1 | 5 |
| Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning | May 10, 2021 | Arithmetic Reasoning, Geometry Problem Solving | Code Available | 1 | 5 |
| Implicit Reasoning in Transformers is Reasoning through Shortcuts | Mar 10, 2025 | Mathematical Reasoning | Code Available | 1 | 5 |
| MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Aug 30, 2024 | Image Captioning, Language Modeling | Code Available | 1 | 5 |
| MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data | Feb 14, 2024 | Automated Theorem Proving, Language Modelling | Code Available | 1 | 5 |
| H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables | Jun 29, 2024 | Fact Verification, Mathematical Reasoning | Code Available | 1 | 5 |
| Breaking the Data Barrier -- Building GUI Agents Through Task Generalization | Apr 14, 2025 | Mathematical Reasoning, Multimodal Reasoning | Code Available | 1 | 5 |
| LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning | Jan 15, 2021 | Inductive Bias, Mathematical Reasoning | Code Available | 1 | 5 |
| Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent | Dec 14, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations | Oct 31, 2023 | GSM8K, Math | Code Available | 1 | 5 |
| Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond | Jun 16, 2023 | Benchmarking, Evidence Selection | Code Available | 1 | 5 |
| Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Apr 24, 2025 | Benchmarking, Math | Code Available | 1 | 5 |
| A Reinforcement Learning Environment for Mathematical Reasoning via Program Synthesis | Jul 15, 2021 | Mathematical Reasoning, Program Synthesis | Code Available | 1 | 5 |
| IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning | Oct 25, 2021 | Arithmetic Reasoning, Mathematical Question Answering | Code Available | 1 | 5 |
| KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning | May 22, 2025 | Mathematical Reasoning, reinforcement-learning | Code Available | 1 | 5 |
| IsarStep: a Benchmark for High-level Mathematical Reasoning | Jun 13, 2020 | Mathematical Proofs, Mathematical Reasoning | Code Available | 1 | 5 |
| Natural Language Reasoning, A Survey | Mar 26, 2023 | Logical Reasoning, Mathematical Reasoning | Code Available | 1 | 5 |
| Peano: Learning Formal Mathematical Reasoning | Nov 29, 2022 | Automated Theorem Proving, Mathematical Reasoning | Code Available | 1 | 5 |
| Boosting MLLM Reasoning with Text-Debiased Hint-GRPO | Mar 31, 2025 | Mathematical Reasoning, Multimodal Reasoning | Code Available | 1 | 5 |