| Measuring Mathematical Problem Solving With the MATH Dataset | Mar 5, 2021 | MathMathematical Problem-Solving | CodeCode Available | 2 | 5 |
| Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Jun 23, 2025 | GPULarge Language Model | CodeCode Available | 2 | 5 |
| Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving | May 12, 2025 | MathMathematical Problem-Solving | CodeCode Available | 2 | 5 |
| LangBridge: Multilingual Reasoning Without Multilingual Supervision | Jan 19, 2024 | Code CompletionLogical Reasoning | CodeCode Available | 2 | 5 |
| Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset | Feb 22, 2024 | DiversityMath | CodeCode Available | 2 | 5 |
| MegaMath: Pushing the Limits of Open Math Corpora | Apr 3, 2025 | DiversityMath | CodeCode Available | 2 | 5 |
| Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs | Feb 4, 2025 | Code GenerationLanguage Modeling | CodeCode Available | 2 | 5 |
| Efficient Reinforcement Finetuning via Adaptive Curriculum Learning | Apr 7, 2025 | MathMathematical Reasoning | CodeCode Available | 2 | 5 |
| MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Jun 26, 2024 | BenchmarkingMath | CodeCode Available | 2 | 5 |
| Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Feb 12, 2024 | Continual PretrainingGSM8K | CodeCode Available | 2 | 5 |
| MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Oct 3, 2023 | ChatbotImage Captioning | CodeCode Available | 2 | 5 |
| Compression Represents Intelligence Linearly | Apr 15, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 | 5 |
| GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers | Feb 29, 2024 | GSM8KMath | CodeCode Available | 2 | 5 |
| Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning | Jun 17, 2024 | Data AugmentationMathematical Reasoning | CodeCode Available | 2 | 5 |
| Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Jun 25, 2024 | DiversityMath | CodeCode Available | 2 | 5 |
| Scaling Relationship on Learning Mathematical Reasoning with Large Language Models | Aug 3, 2023 | Arithmetic ReasoningGSM8K | CodeCode Available | 2 | 5 |
| A Survey of Deep Learning for Mathematical Reasoning | Dec 20, 2022 | Deep LearningMath | CodeCode Available | 2 | 5 |
| ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization | Feb 6, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 2 | 5 |
| MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Sep 21, 2023 | Arithmetic ReasoningGSM8K | CodeCode Available | 2 | 5 |
| MathPile: A Billion-Token-Scale Pretraining Corpus for Math | Dec 28, 2023 | Language IdentificationMath | CodeCode Available | 2 | 5 |
| MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Oct 10, 2024 | MathMathematical Reasoning | CodeCode Available | 2 | 5 |
| AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning | Nov 18, 2024 | Mathematical Reasoning | CodeCode Available | 2 | 5 |
| MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Oct 5, 2023 | Arithmetic ReasoningGSM8K | CodeCode Available | 2 | 5 |
| CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Sep 4, 2024 | GSM8KMath | CodeCode Available | 2 | 5 |
| LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation | Apr 10, 2025 | Code GenerationContinual Learning | CodeCode Available | 2 | 5 |