| B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners | Dec 23, 2024 | Mathematical Reasoning | Code Available | 2 | 5 |
| SOLO: A Single Transformer for Scalable Vision-Language Modeling | Jul 8, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning | Feb 10, 2025 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| Efficient Reinforcement Finetuning via Adaptive Curriculum Learning | Apr 7, 2025 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| Reinforcing General Reasoning without Verifiers | May 27, 2025 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs | Feb 4, 2025 | Code Generation, Language Modeling | Code Available | 2 | 5 |
| An Expression Tree Decoding Strategy for Mathematical Equation Generation | Oct 14, 2023 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| LoRA-Pro: Are Low-Rank Adapters Properly Optimized? | Jul 25, 2024 | Code Generation, Computational Efficiency | Code Available | 2 | 5 |
| MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Oct 10, 2024 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| MathPile: A Billion-Token-Scale Pretraining Corpus for Math | Dec 28, 2023 | Language Identification, Math | Code Available | 2 | 5 |
| Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization | Apr 8, 2025 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra, Anachronisms | Code Available | 2 | 5 |
| Benchmarking Benchmark Leakage in Large Language Models | Apr 29, 2024 | Benchmarking, Mathematical Reasoning | Code Available | 2 | 5 |
| Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning | Jun 17, 2024 | Data Augmentation, Mathematical Reasoning | Code Available | 2 | 5 |
| Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Jun 25, 2024 | Diversity, Math | Code Available | 2 | 5 |
| Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning | Jun 11, 2025 | Image Captioning, Math | Code Available | 2 | 5 |
| A Survey of Deep Learning for Mathematical Reasoning | Dec 20, 2022 | Deep Learning, Math | Code Available | 2 | 5 |
| MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Jun 26, 2024 | Benchmarking, Math | Code Available | 2 | 5 |
| ProcessBench: Identifying Process Errors in Mathematical Reasoning | Dec 9, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| MegaMath: Pushing the Limits of Open Math Corpora | Apr 3, 2025 | Diversity, Math | Code Available | 2 | 5 |
| Preference Optimization for Reasoning with Pseudo Feedback | Nov 25, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning | Oct 9, 2023 | Arithmetic Reasoning, Data Augmentation | Code Available | 2 | 5 |
| LangBridge: Multilingual Reasoning Without Multilingual Supervision | Jan 19, 2024 | Code Completion, Logical Reasoning | Code Available | 2 | 5 |
| DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization | May 18, 2025 | Mathematical Reasoning | Code Available | 2 | 5 |
| Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset | Feb 22, 2024 | Diversity, Math | Code Available | 2 | 5 |
| Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Feb 12, 2024 | Continual Pretraining, GSM8K | Code Available | 2 | 5 |
| LeanAgent: Lifelong Learning for Formal Theorem Proving | Oct 8, 2024 | Abstract Algebra, Automated Theorem Proving | Code Available | 2 | 5 |
| Reformatted Alignment | Feb 19, 2024 | GSM8K, Hallucination | Code Available | 2 | 5 |
| Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Jun 23, 2025 | GPU, Large Language Model | Code Available | 2 | 5 |
| Offline Reinforcement Learning for LLM Multi-Step Reasoning | Dec 20, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models | Oct 10, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| Compression Represents Intelligence Linearly | Apr 15, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning | Nov 18, 2024 | Mathematical Reasoning | Code Available | 2 | 5 |
| Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem | Oct 21, 2022 | Contrastive Learning, Math | Code Available | 2 | 5 |
| Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models | May 24, 2024 | Atari Games, Mathematical Reasoning | Code Available | 2 | 5 |
| LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation | Apr 10, 2025 | Code Generation, Continual Learning | Code Available | 2 | 5 |
| MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Jun 5, 2025 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Sep 4, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Jan 29, 2025 | Instruction Following, Math | Code Available | 2 | 5 |
| O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning | Jan 22, 2025 | Mathematical Reasoning | Code Available | 2 | 5 |
| Optimizing Anytime Reasoning via Budget Relative Policy Optimization | May 19, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | Code Available | 2 | 5 |
| Climbing the Ladder of Reasoning: What LLMs Can-and Still Can't-Solve after SFT? | Apr 16, 2025 | Mathematical Reasoning | Code Available | 1 | 5 |
| CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning | Aug 10, 2022 | Math, Mathematical Reasoning | Code Available | 1 | 5 |
| H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables | Jun 29, 2024 | Fact Verification, Mathematical Reasoning | Code Available | 1 | 5 |
| Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning | May 30, 2025 | Math, Mathematical Reasoning | Code Available | 1 | 5 |
| Ada-Instruct: Adapting Instruction Generators for Complex Reasoning | Oct 6, 2023 | Code Completion, In-Context Learning | Code Available | 1 | 5 |
| IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning | Oct 25, 2021 | Arithmetic Reasoning, Mathematical Question Answering | Code Available | 1 | 5 |
| MathPrompter: Mathematical Reasoning using Large Language Models | Mar 4, 2023 | Arithmetic Reasoning, Math | Code Available | 1 | 5 |
| MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion | Mar 20, 2025 | Data Augmentation, Mathematical Problem-Solving | Code Available | 1 | 5 |
| Implicit Reasoning in Transformers is Reasoning through Shortcuts | Mar 10, 2025 | Mathematical Reasoning | Code Available | 1 | 5 |