| LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Jul 6, 2024 | GSM8K, parameter-efficient fine-tuning | Code Available | 3 | 5 |
| TokenSkip: Controllable Chain-of-Thought Compression in LLMs | Feb 17, 2025 | GSM8K | Code Available | 3 | 5 |
| PAL: Program-aided Language Models | Nov 18, 2022 | Arithmetic Reasoning, GSM8K | Code Available | 3 | 5 |
| Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution | Apr 13, 2025 | GSM8K, Math | Code Available | 3 | 5 |
| Training Verifiers to Solve Math Word Problems | Oct 27, 2021 | GSM8K, Math | Code Available | 3 | 5 |
| Scaling up Masked Diffusion Models on Text | Oct 24, 2024 | GSM8K, Language Modeling | Code Available | 3 | 5 |
| Large Language Monkeys: Scaling Inference Compute with Repeated Sampling | Jul 31, 2024 | GSM8K, Math | Code Available | 3 | 5 |
| SkyMath: Technical Report | Oct 25, 2023 | GSM8K, Language Modeling | Code Available | 3 | 5 |
| Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | May 26, 2023 | GSM8K, Multimodal Reasoning | Code Available | 3 | 5 |
| PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Apr 3, 2024 | GSM8K, Quantization | Code Available | 3 | 5 |
| LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding | Apr 25, 2024 | GSM8K, HellaSwag | Code Available | 3 | 5 |
| Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs | Jun 26, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 3 | 5 |
| Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models | Oct 10, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| How to Correctly do Semantic Backpropagation on Language-based Agentic Systems | Dec 4, 2024 | GSM8K | Code Available | 2 | 5 |
| Natural Language Fine-Tuning | Dec 29, 2024 | GSM8K, Large Language Model | Code Available | 2 | 5 |
| Offline Reinforcement Learning for LLM Multi-Step Reasoning | Dec 20, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Feb 12, 2024 | Continual Pretraining, GSM8K | Code Available | 2 | 5 |
| MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Sep 21, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 2 | 5 |
| GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers | Feb 29, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| Meta Prompting for AI Systems | Nov 20, 2023 | Data Interaction, GSM8K | Code Available | 2 | 5 |
| Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process | Jul 29, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models | Mar 21, 2025 | GSM8K, Question Answering | Code Available | 2 | 5 |
| LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | May 27, 2024 | Benchmarking, GSM8K | Code Available | 2 | 5 |
| CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models | Mar 28, 2025 | GPU, GSM8K | Code Available | 2 | 5 |
| Let LLMs Break Free from Overthinking via Self-Braking Tuning | May 20, 2025 | GSM8K | Code Available | 2 | 5 |