| Benchmarking Large Language Models for Math Reasoning Tasks | Aug 20, 2024 | Benchmarking, In-Context Learning | Code Available | 0 | 5 |
| Decomposing Elements of Problem Solving: What "Math" Does RL Teach? | May 28, 2025 | Math, Mathematical Problem-Solving | Code Available | 0 | 5 |
| Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation | Jul 2, 2024 | Code Generation, Form | Code Available | 0 | 5 |
| Instructing Large Language Models to Identify and Ignore Irrelevant Conditions | Mar 19, 2024 | Math, Mathematical Reasoning | Code Available | 0 | 5 |
| Compositional Generalization with Tree Stack Memory Units | Nov 5, 2019 | Mathematical Reasoning, Zero-shot Generalization | Code Available | 0 | 5 |
| MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO | May 19, 2025 | Decoder, Image Generation | Code Available | 0 | 5 |
| MC-NEST -- Enhancing Mathematical Reasoning in Large Language Models with a Monte Carlo Nash Equilibrium Self-Refine Tree | Nov 23, 2024 | Decision Making, Mathematical Reasoning | Code Available | 0 | 5 |
| An Efficient and Precise Training Data Construction Framework for Process-supervised Reward Model in Mathematical Reasoning | Mar 4, 2025 | Mathematical Reasoning | Code Available | 0 | 5 |
| Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | Jun 24, 2023 | Decoder, Ingenuity | Code Available | 0 | 5 |
| MCC-KD: Multi-CoT Consistent Knowledge Distillation | Oct 23, 2023 | Diversity, Knowledge Distillation | Code Available | 0 | 5 |
| MathScale: Scaling Instruction Tuning for Mathematical Reasoning | Mar 5, 2024 | GSM8K, Math | Code Available | 0 | 5 |
| Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying | Dec 19, 2024 | Math, Mathematical Reasoning | Code Available | 0 | 5 |
| MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Aug 14, 2024 | Math, Mathematical Reasoning | Code Available | 0 | 5 |
| MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning | Feb 27, 2024 | 8k, Language Modeling | Code Available | 0 | 5 |
| How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective | Oct 14, 2024 | Density Ratio Estimation, GSM8K | Code Available | 0 | 5 |
| Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges | Feb 12, 2025 | GSM8K, Math | Code Available | 0 | 5 |
| How Do Humans Write Code? Large Models Do It the Same Way Too | Feb 24, 2024 | Code Generation, Math | Code Available | 0 | 5 |
| Analysing Mathematical Reasoning Abilities of Neural Models | Apr 2, 2019 | Mathematical Question Answering, Mathematical Reasoning | Code Available | 0 | 5 |
| Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning | Jun 5, 2025 | Arithmetic Reasoning, Math | Code Available | 0 | 5 |
| Hierarchical Attention Generates Better Proofs | Apr 27, 2025 | Automated Theorem Proving, Mathematical Proofs | Code Available | 0 | 5 |
| Adaptive Graph Pruning for Multi-Agent Communication | Jun 3, 2025 | Code Generation, Large Language Model | Code Available | 0 | 5 |
| HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class | May 17, 2025 | Math, Mathematical Problem-Solving | Code Available | 0 | 5 |
| Guided Stream of Search: Learning to Better Search with Language Models via Optimal Path Guidance | Oct 3, 2024 | Mathematical Reasoning | Code Available | 0 | 5 |
| MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty | Aug 13, 2024 | Mathematical Reasoning, Question Answering | Code Available | 0 | 5 |
| MARGE: Improving Math Reasoning for LLMs with Guided Exploration | May 18, 2025 | Math, Mathematical Reasoning | Code Available | 0 | 5 |