| MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision | May 19, 2025 | Math, Mathematical Reasoning | Code Available | 4 |
| AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset | Apr 23, 2025 | Math, Mathematical Reasoning | Code Available | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Feb 10, 2025 | Hierarchical Reinforcement Learning, Language Modeling | Code Available | 4 |
| SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights | Oct 11, 2024 | GSM8K, Math | Code Available | 4 |
| OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data | Oct 2, 2024 | Arithmetic Reasoning, Large Language Model | Code Available | 4 |
| LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover | Jul 24, 2024 | Automated Theorem Proving, Math | Code Available | 4 |
| MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine | Jul 11, 2024 | Contrastive Learning, Language Modelling | Code Available | 4 |
| Knowledge Fusion of Large Language Models | Jan 19, 2024 | Code Generation, Common Sense Reasoning | Code Available | 4 |
| ChatGPT for Robotics: Design Principles and Model Abilities | Feb 20, 2023 | Mathematical Reasoning, Prompt Engineering | Code Available | 4 |
| Galactica: A Large Language Model for Science | Nov 16, 2022 | Anachronisms, Bias Detection | Code Available | 4 |
| Spurious Rewards: Rethinking Training Signals in RLVR | Jun 12, 2025 | Math, Mathematical Reasoning | Code Available | 3 |
| MathArena: Evaluating LLMs on Uncontaminated Math Competitions | May 29, 2025 | Math, Mathematical Reasoning | Code Available | 3 |
| General-Reasoner: Advancing LLM Reasoning Across All Domains | May 20, 2025 | All, Math | Code Available | 3 |
| MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem | May 20, 2025 | Mathematical Reasoning, scientific discovery | Code Available | 3 |
| MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning | May 15, 2025 | cross-modal alignment, Geometry Problem Solving | Code Available | 3 |
| Reinforcement Learning for Reasoning in Large Language Models with One Training Example | Apr 29, 2025 | Domain Generalization, Math | Code Available | 3 |
| DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning | Apr 15, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | Code Available | 3 |
| MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs | Apr 1, 2025 | Knowledge Graphs, Mathematical Reasoning | Code Available | 3 |
| Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't | Mar 20, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | Code Available | 3 |
| Self-rewarding correction for mathematical reasoning | Feb 26, 2025 | Mathematical Reasoning | Code Available | 3 |
| Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs | Jun 26, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 3 |
| Step-level Value Preference Optimization for Mathematical Reasoning | Jun 16, 2024 | Learning-To-Rank, Math | Code Available | 3 |
| MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning | May 20, 2024 | Continual Pretraining, Mathematical Reasoning | Code Available | 3 |
| MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning | May 13, 2024 | Data Augmentation, GSM8K | Code Available | 3 |
| AlphaMath Almost Zero: Process Supervision without Process | May 6, 2024 | Mathematical Reasoning, Math Word Problem Solving | Code Available | 3 |