| ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools | Jun 18, 2024 | All, GSM8K | Code Available | 14 |
| Qwen2 Technical Report | Jul 15, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 13 |
| LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Mar 26, 2024 | GPU, GSM8K | Code Available | 9 |
| LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Mar 19, 2024 | GSM8K, Language Modelling | Code Available | 9 |
| Qwen2.5-Omni Technical Report | Mar 26, 2025 | Automatic Speech Recognition (ASR), GSM8K | Code Available | 7 |
| Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training | May 23, 2024 | GSM8K, Mixture-of-Experts | Code Available | 7 |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Jan 28, 2022 | Common Sense Reasoning, GSM8K | Code Available | 6 |
| Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B | Jun 11, 2024 | Decision Making, GSM8K | Code Available | 5 |
| Common 7B Language Models Already Possess Strong Math Capabilities | Mar 7, 2024 | GSM8K, Math | Code Available | 5 |
| LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models | Oct 9, 2023 | GSM8K, In-Context Learning | Code Available | 5 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Aug 18, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 5 |
| SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator | Dec 16, 2024 | GSM8K, Language Modeling | Code Available | 4 |
| SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights | Oct 11, 2024 | GSM8K, Math | Code Available | 4 |
| Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers | Aug 12, 2024 | GSM8K, Math | Code Available | 4 |
| Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | Mar 14, 2024 | GSM8K, Language Modelling | Code Available | 4 |
| OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | Feb 15, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 4 |
| InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning | Feb 9, 2024 | Data Augmentation, GSM8K | Code Available | 4 |
| ReFT: Reasoning with Reinforced Fine-Tuning | Jan 17, 2024 | GSM8K, Math | Code Available | 4 |
| Baichuan 2: Open Large-scale Language Models | Sep 19, 2023 | Feature Engineering, GSM8K | Code Available | 4 |
| Thinkless: LLM Learns When to Think | May 19, 2025 | GSM8K, Math | Code Available | 3 |
| Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution | Apr 13, 2025 | GSM8K, Math | Code Available | 3 |
| TokenSkip: Controllable Chain-of-Thought Compression in LLMs | Feb 17, 2025 | GSM8K | Code Available | 3 |
| Scaling up Masked Diffusion Models on Text | Oct 24, 2024 | GSM8K, Language Modeling | Code Available | 3 |
| Large Language Monkeys: Scaling Inference Compute with Repeated Sampling | Jul 31, 2024 | GSM8K, Math | Code Available | 3 |
| LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Jul 6, 2024 | GSM8K, parameter-efficient fine-tuning | Code Available | 3 |
| Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs | Jun 26, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 3 |
| Automatic Instruction Evolving for Large Language Models | Jun 2, 2024 | GSM8K, HumanEval | Code Available | 3 |
| From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step | May 23, 2024 | GSM8K | Code Available | 3 |
| MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning | May 13, 2024 | Data Augmentation, GSM8K | Code Available | 3 |
| Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning | May 1, 2024 | ARC, GSM8K | Code Available | 3 |
| LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding | Apr 25, 2024 | GSM8K, HellaSwag | Code Available | 3 |
| PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Apr 3, 2024 | GSM8K, Quantization | Code Available | 3 |
| MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline | Jan 16, 2024 | GSM8K, Math | Code Available | 3 |
| SkyMath: Technical Report | Oct 25, 2023 | GSM8K, Language Modeling | Code Available | 3 |
| Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | May 26, 2023 | GSM8K, Multimodal Reasoning | Code Available | 3 |
| PAL: Program-aided Language Models | Nov 18, 2022 | Arithmetic Reasoning, GSM8K | Code Available | 3 |
| Training Verifiers to Solve Math Word Problems | Oct 27, 2021 | GSM8K, Math | Code Available | 3 |
| any4: Learned 4-bit Numeric Representation for LLMs | Jul 7, 2025 | GPU, GSM8K | Code Available | 2 |
| Let LLMs Break Free from Overthinking via Self-Braking Tuning | May 20, 2025 | GSM8K | Code Available | 2 |
| Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space | May 19, 2025 | GSM8K, Math | Code Available | 2 |
| Synthetic Data RL: Task Definition Is All You Need | May 18, 2025 | All, GSM8K | Code Available | 2 |
| SLOT: Sample-specific Language Model Optimization at Test-time | May 18, 2025 | GSM8K, Language Modeling | Code Available | 2 |
| Dynamic Early Exit in Reasoning Models | Apr 22, 2025 | GSM8K, Math | Code Available | 2 |
| SEAL: Steerable Reasoning Calibration of Large Language Models for Free | Apr 7, 2025 | GSM8K | Code Available | 2 |
| CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models | Mar 28, 2025 | GPU, GSM8K | Code Available | 2 |
| Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models | Mar 21, 2025 | GSM8K, Question Answering | Code Available | 2 |
| Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models | Feb 24, 2025 | GSM8K, Math | Code Available | 2 |
| SIFT: Grounding LLM Reasoning in Contexts via Stickers | Feb 19, 2025 | GSM8K, Math | Code Available | 2 |
| CoT-Valve: Length-Compressible Chain-of-Thought Tuning | Feb 13, 2025 | GSM8K | Code Available | 2 |
| Natural Language Fine-Tuning | Dec 29, 2024 | GSM8K, Large Language Model | Code Available | 2 |