| Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning | Oct 18, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback | Jan 18, 2025 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Step-wise Adaptive Integration of Supervised Fine-tuning and Reinforcement Learning for Task-Specific LLMs | May 19, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Subtle Errors Matter: Preference Learning via Error-injected Self-editing | Oct 9, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Supervised Optimism Correction: Be Confident When LLMs Are Sure | Apr 10, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| Sustainability of Collusion and Market Transparency in a Sequential Search Market: a Generalization | May 5, 2021 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models | Feb 20, 2024 | Instruction Following, Logical Reasoning | —Unverified | 0 | 0 |
| Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use | Apr 7, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| System-2 Mathematical Reasoning via Enriched Instruction Tuning | Dec 22, 2024 | ERP, GSM8K | —Unverified | 0 | 0 |
| Table as Thought: Exploring Structured Thoughts in LLM Reasoning | Jan 4, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |