| Learning to chain-of-thought with Jensen's evidence lower bound | Mar 25, 2025 | Mathematical Reasoning, reinforcement-learning | —Unverified | 0 |
| LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning | Dec 28, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Dual Instruction Tuning with Large Language Models for Mathematical Reasoning | Mar 27, 2024 | Domain Generalization, Mathematical Reasoning | —Unverified | 0 |
| LLMs can be easily Confused by Instructional Distractions | Feb 5, 2025 | Bias Detection, Code Generation | —Unverified | 0 |
| Applying RLAIF for Code Generation with API-usage in Lightweight LLMs | Jun 28, 2024 | Code Generation, Hallucination | —Unverified | 0 |
| Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation | May 13, 2025 | Imitation Learning, Mathematical Reasoning | —Unverified | 0 |
| Learning by Applying: A General Framework for Mathematical Reasoning via Enhancing Explicit Knowledge Learning | Feb 11, 2023 | Decoder, Mathematical Reasoning | —Unverified | 0 |
| DavIR: Data Selection via Implicit Reward for Large Language Models | Oct 16, 2023 | Causal Language Modeling, GSM8K | —Unverified | 0 |
| DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models | May 20, 2025 | GSM8K, Mathematical Reasoning | —Unverified | 0 |
| Mathematical Reasoning in Latent Space | Sep 26, 2019 | Mathematical Reasoning | —Unverified | 0 |