| A Careful Examination of Large Language Model Performance on Grade School Arithmetic | May 1, 2024 | GSM8KLanguage Modeling | —Unverified | 0 | 0 |
| Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning | Apr 18, 2025 | AllGSM8K | —Unverified | 0 | 0 |
| Uncertainty Aware Learning for Language Model Alignment | Jun 7, 2024 | GSM8KLanguage Modeling | —Unverified | 0 | 0 |
| No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function | Sep 1, 2023 | GSM8KMathematical Reasoning | —Unverified | 0 | 0 |
| Nudging: Inference-time Alignment of LLMs via Guided Decoding | Oct 11, 2024 | General KnowledgeGSM8K | —Unverified | 0 | 0 |
| Fine-Grained Self-Endorsement Improves Factuality and Reasoning | Feb 23, 2024 | GSM8KLanguage Modeling | —Unverified | 0 | 0 |
| FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning | Oct 8, 2024 | GSM8KHallucination | —Unverified | 0 | 0 |
| On Designing Effective RL Reward at Training Time for LLM Reasoning | Oct 19, 2024 | GSM8KMath | —Unverified | 0 | 0 |
| Uncertainty-Aware Search and Value Models: Mitigating Search Scaling Flaws in LLMs | Feb 16, 2025 | GSM8KThompson Sampling | —Unverified | 0 | 0 |
| Making Large Language Models Better Reasoners with Step-Aware Verifier | Jun 6, 2022 | Arithmetic ReasoningFew-Shot Learning | —Unverified | 0 | 0 |
| Fast on the Easy, Deep on the Hard: Efficient Reasoning via Powered Length Penalty | Jun 12, 2025 | GSM8K | —Unverified | 0 | 0 |
| Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation | Oct 22, 2024 | GSM8KMath | —Unverified | 0 | 0 |
| Orca-Math: Unlocking the potential of SLMs in Grade School Math | Feb 16, 2024 | Arithmetic ReasoningGSM8K | —Unverified | 0 | 0 |
| Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree | Dec 17, 2024 | GSM8KHumanEval | —Unverified | 0 | 0 |
| Exploring an LM to generate Prolog Predicates from Mathematics Questions | Sep 7, 2023 | GSM8KLanguage Modeling | —Unverified | 0 | 0 |
| Advancing Process Verification for Large Language Models via Tree-Based Preference Learning | Jun 29, 2024 | Binary ClassificationGSM8K | —Unverified | 0 | 0 |
| Explicit Knowledge Transfer for Weakly-Supervised Code Generation | Nov 30, 2022 | Code GenerationFew-Shot Learning | —Unverified | 0 | 0 |
| PARAMANU-GANITA: Language Model with Mathematical Capabilities | Apr 22, 2024 | Domain AdaptationGSM8K | —Unverified | 0 | 0 |
| Patience Is The Key to Large Language Model Reasoning | Nov 20, 2024 | GSM8KLanguage Modeling | —Unverified | 0 | 0 |
| PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation | Oct 2, 2024 | Data AugmentationDiversity | —Unverified | 0 | 0 |
| Pheromone-based Learning of Optimal Reasoning Paths | Jan 31, 2025 | ARCGSM8K | —Unverified | 0 | 0 |
| Excessive Reasoning Attack on Reasoning LLMs | Jun 17, 2025 | GSM8K | —Unverified | 0 | 0 |
| Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization | Feb 8, 2025 | GSM8KMath | —Unverified | 0 | 0 |
| Plan for Speed -- Dilated Scheduling for Masked Diffusion Language Models | Jun 23, 2025 | Code CompletionGSM8K | —Unverified | 0 | 0 |
| PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models | May 22, 2025 | GSM8KLarge Language Model | —Unverified | 0 | 0 |