| MAS-Zero: Designing Multi-Agent Systems with Zero Supervision | May 26, 2025 | Math, Problem Decomposition | Code Available | 2 |
| DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Jun 18, 2024 | Arithmetic Reasoning, Math | Code Available | 2 |
| Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Nov 9, 2023 | Math, Question Answering | Code Available | 2 |
| Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Jan 29, 2025 | Instruction Following, Math | Code Available | 2 |
| MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Apr 6, 2024 | Logical Reasoning, Math | Code Available | 2 |
| Cumulative Reasoning with Large Language Models | Aug 8, 2023 | Decision Making, Logical Reasoning | Code Available | 2 |
| AdaptThink: Reasoning Models Can Learn When to Think | May 19, 2025 | Math | Code Available | 2 |
| Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models | Jun 13, 2024 | Math, Quantization | Code Available | 2 |
| MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Sep 11, 2023 | Math, Mathematical Reasoning | Code Available | 2 |
| MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | May 20, 2024 | College Mathematics, GSM8K | Code Available | 2 |