| DRA-GRPO: Exploring Diversity-Aware Reward Adjustment for R1-Zero-Like Training of Large Language Models | May 14, 2025 | DiversityMathematical Reasoning | CodeCode Available | 1 | 5 |
| DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning | Jul 4, 2024 | AvgGSM8K | CodeCode Available | 1 | 5 |
| Learning From Mistakes Makes LLM Better Reasoner | Oct 31, 2023 | GSM8KMath | CodeCode Available | 1 | 5 |
| Large Language Models for Multi-Robot Systems: A Survey | Feb 6, 2025 | Action GenerationBenchmarking | CodeCode Available | 1 | 5 |
| LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning | Jan 15, 2021 | Inductive BiasMathematical Reasoning | CodeCode Available | 1 | 5 |
| ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation | May 27, 2024 | Code GenerationHumanEval | CodeCode Available | 1 | 5 |
| Rewriting Pre-Training Data Boosts LLM Performance in Math and Code | May 5, 2025 | Code GenerationGSM8K | CodeCode Available | 1 | 5 |
| RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics | May 18, 2025 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning | May 27, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models | May 23, 2024 | Knowledge DistillationMath | CodeCode Available | 1 | 5 |
| KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning | May 22, 2025 | Mathematical Reasoningreinforcement-learning | CodeCode Available | 1 | 5 |
| Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning | May 10, 2021 | Arithmetic ReasoningGeometry Problem Solving | CodeCode Available | 1 | 5 |
| RaDeR: Reasoning-aware Dense Retrieval Models | May 23, 2025 | MathMathematical Problem-Solving | CodeCode Available | 1 | 5 |
| Reasoning on a Budget: Miniaturizing DeepSeek R1 with SFT-GRPO Alignment for Instruction-Tuned LLMs | May 16, 2025 | Deep Reinforcement LearningMathematical Reasoning | CodeCode Available | 1 | 5 |
| Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver | Sep 6, 2024 | Geometry Problem SolvingMathematical Reasoning | CodeCode Available | 1 | 5 |
| An In-depth Look at Gemini's Language Abilities | Dec 18, 2023 | Instruction FollowingMath | CodeCode Available | 1 | 5 |
| DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | May 29, 2025 | Automated Theorem ProvingMathematical Reasoning | CodeCode Available | 1 | 5 |
| OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling | Jul 13, 2024 | BenchmarkingMath | CodeCode Available | 1 | 5 |
| A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Apr 15, 2025 | Code GenerationGeneral Knowledge | CodeCode Available | 1 | 5 |
| MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion | Mar 20, 2025 | Data AugmentationMathematical Problem-Solving | CodeCode Available | 1 | 5 |
| IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning | Oct 25, 2021 | Arithmetic ReasoningMathematical Question Answering | CodeCode Available | 1 | 5 |
| Implicit Reasoning in Transformers is Reasoning through Shortcuts | Mar 10, 2025 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination | Jul 14, 2025 | MathMathematical Reasoning | CodeCode Available | 1 | 5 |
| A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level | Dec 31, 2021 | Few-Shot LearningLanguage Modelling | CodeCode Available | 1 | 5 |
| Process-Driven Autoformalization in Lean 4 | Jun 4, 2024 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| PromptCoT: Synthesizing Olympiad-level Problems for Mathematical Reasoning in Large Language Models | Mar 4, 2025 | GSM8KMath | CodeCode Available | 1 | 5 |
| Crosslingual Reasoning through Test-Time Scaling | May 8, 2025 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization | Jul 8, 2025 | Active LearningAutomated Theorem Proving | CodeCode Available | 1 | 5 |
| Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability | Nov 29, 2024 | GSM8KMath | CodeCode Available | 1 | 5 |
| A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models | Oct 21, 2022 | MathMathematical Reasoning | CodeCode Available | 1 | 5 |
| Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning | Feb 19, 2025 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| Auto-Regressive Next-Token Predictors are Universal Learners | Sep 13, 2023 | Mathematical ReasoningText Generation | CodeCode Available | 1 | 5 |
| HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Oct 13, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought | Feb 24, 2025 | Mathematical ReasoningMisinformation | CodeCode Available | 1 | 5 |
| AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence | Feb 19, 2025 | Code GenerationDecision Making | CodeCode Available | 1 | 5 |
| Peano: Learning Formal Mathematical Reasoning | Nov 29, 2022 | Automated Theorem ProvingMathematical Reasoning | CodeCode Available | 1 | 5 |
| OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning | Nov 16, 2023 | Arithmetic ReasoningGSM8K | CodeCode Available | 1 | 5 |
| PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization | Sep 25, 2024 | 8kDomain Adaptation | CodeCode Available | 1 | 5 |
| A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram | Feb 22, 2023 | Geometry Problem SolvingMathematical Reasoning | CodeCode Available | 1 | 5 |
| Control LLM: Controlled Evolution for Intelligence Retention in LLM | Jan 19, 2025 | MathMathematical Reasoning | CodeCode Available | 1 | 5 |
| Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning | May 30, 2025 | MathMathematical Reasoning | CodeCode Available | 1 | 5 |
| Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent | Dec 14, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models | Apr 13, 2025 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models | Feb 22, 2024 | MathMathematical Reasoning | CodeCode Available | 1 | 5 |
| Augmenting Math Word Problems via Iterative Question Composing | Jan 17, 2024 | MathMathematical Reasoning | CodeCode Available | 1 | 5 |
| MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Aug 30, 2024 | Image CaptioningLanguage Modeling | CodeCode Available | 1 | 5 |
| GOLD: Geometry Problem Solver with Natural Language Description | May 1, 2024 | Math | CodeCode Available | 1 | 5 |
| Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs | Jan 11, 2025 | MathMathematical Problem-Solving | CodeCode Available | 1 | 5 |
| CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning | Oct 14, 2024 | MathMathematical Reasoning | CodeCode Available | 1 | 5 |
| LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling | May 25, 2025 | Computational EfficiencyMathematical Reasoning | CodeCode Available | 1 | 5 |