| ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning | Oct 24, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment | Feb 5, 2025 | GSM8K, HumanEval | —Unverified | 0 | 0 |
| Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths | Oct 7, 2024 | Attribute, GSM8K | —Unverified | 0 | 0 |
| Reasoning Robustness of LLMs to Adversarial Typographical Errors | Nov 8, 2024 | GSM8K, MMLU | —Unverified | 0 | 0 |
| Unlocking Structured Thinking in Language Models with Cognitive Prompting | Oct 3, 2024 | Arithmetic Reasoning, GSM8K | —Unverified | 0 | 0 |
| Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models | Jan 3, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| Dynamic Parallel Tree Search for Efficient LLM Reasoning | Feb 22, 2025 | Computational Efficiency, GSM8K | —Unverified | 0 | 0 |
| Dual Decomposition of Weights and Singular Value Low Rank Adaptation | May 20, 2025 | GSM8K, MMLU | —Unverified | 0 | 0 |
| Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models | May 15, 2025 | Code Generation, GSM8K | —Unverified | 0 | 0 |
| Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures | Nov 25, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models | Apr 18, 2024 | GSM8K, MMLU | —Unverified | 0 | 0 |
| Relevant or Random: Can LLMs Truly Perform Analogical Reasoning? | Apr 19, 2024 | GSM8K | —Unverified | 0 | 0 |
| Reliable Reasoning Beyond Natural Language | Jul 16, 2024 | GSM8K, Mathematical Reasoning | —Unverified | 0 | 0 |
| Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation | Oct 27, 2024 | GSM8K, Language Modeling | —Unverified | 0 | 0 |
| DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models | May 20, 2025 | GSM8K, Mathematical Reasoning | —Unverified | 0 | 0 |
| RevOrder: A Novel Method for Enhanced Arithmetic in Language Models | Feb 6, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs | Dec 30, 2024 | GSM8K | —Unverified | 0 | 0 |
| RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs | May 19, 2025 | GSM8K | —Unverified | 0 | 0 |
| RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations | Jan 25, 2025 | Computational Efficiency, GSM8K | —Unverified | 0 | 0 |
| Rule-Guided Feedback: Enhancing Reasoning by Enforcing Rule Adherence in Large Language Models | Mar 14, 2025 | Checkmate In One, GSM8K | —Unverified | 0 | 0 |
| S^3c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners | Sep 3, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Does your data spark joy? Performance gains from domain upsampling at the end of training | Jun 5, 2024 | GSM8K, HumanEval | —Unverified | 0 | 0 |
| SAIE Framework: Support Alone Isn't Enough -- Advancing LLM Training with Adversarial Remarks | Nov 14, 2023 | GSM8K, Math | —Unverified | 0 | 0 |
| Sample, Don't Search: Rethinking Test-Time Alignment for Language Models | Apr 4, 2025 | GSM8K, Mathematical Reasoning | —Unverified | 0 | 0 |
| Unsupervised Elicitation of Language Models | Jun 11, 2025 | GSM8K, TruthfulQA | —Unverified | 0 | 0 |
| DNA 1.0 Technical Report | Jan 18, 2025 | Belebele, GSM8K | —Unverified | 0 | 0 |
| UPAR: A Kantian-Inspired Prompting Framework for Enhancing Large Language Model Capabilities | Sep 30, 2023 | Causal Judgment, GSM8K | —Unverified | 0 | 0 |
| DiversiGATE: A Comprehensive Framework for Reliable Large Language Models | Jun 22, 2023 | Arithmetic Reasoning, GSM8K | —Unverified | 0 | 0 |
| YODA: Teacher-Student Progressive Learning for Language Models | Jan 28, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Distill Not Only Data but Also Rewards: Can Smaller Language Models Surpass Larger Ones? | Feb 26, 2025 | GSM8K, MMLU | —Unverified | 0 | 0 |
| SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models | Feb 25, 2025 | Continual Learning, GSM8K | —Unverified | 0 | 0 |
| Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models | Nov 2, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Dialectical Behavior Therapy Approach to LLM Prompting | Oct 10, 2024 | GSM8K, StrategyQA | —Unverified | 0 | 0 |
| Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation | Oct 3, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models | Aug 16, 2024 | GSM8K, MMLU | —Unverified | 0 | 0 |
| D^2LoRA: Data-Driven LoRA Initialization for Low Resource Tasks | Mar 23, 2025 | GSM8K | —Unverified | 0 | 0 |
| Self-Consistency Boosts Calibration for Math Reasoning | Mar 14, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic | Aug 29, 2024 | GSM8K, Language Modeling | —Unverified | 0 | 0 |
| Self-Consistency Preference Optimization | Nov 6, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Self-Evaluation Guided Beam Search for Reasoning | May 1, 2023 | Arithmetic Reasoning, GSM8K | —Unverified | 0 | 0 |
| Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models | Mar 4, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks | Sep 13, 2024 | ARC, Code Generation | —Unverified | 0 | 0 |
| Self-Imagine: Effective Unimodal Reasoning with Multimodal Models using Self-Imagination | Jan 16, 2024 | GSM8K, Language Modeling | —Unverified | 0 | 0 |
| CoThink: Token-Efficient Reasoning via Instruct Models Guiding Reasoning Models | May 28, 2025 | GSM8K | —Unverified | 0 | 0 |
| Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst | May 20, 2025 | ARC, GSM8K | —Unverified | 0 | 0 |
| Cost-Saving LLM Cascades with Early Abstention | Feb 13, 2025 | GSM8K, MMLU | —Unverified | 0 | 0 |
| Self-Training Large Language Models for Tool-Use Without Demonstrations | Feb 9, 2025 | GSM8K, Mathematical Reasoning | —Unverified | 0 | 0 |
| Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs | May 19, 2023 | Arithmetic Reasoning, GSM8K | —Unverified | 0 | 0 |
| Semantic Exploration with Adaptive Gating for Efficient Problem Solving with Language Models | Jan 10, 2025 | ARC, Diversity | —Unverified | 0 | 0 |
| CoRE: Enhancing Metacognition with Label-free Self-evaluation in LRMs | Jul 8, 2025 | GSM8K, Math | —Unverified | 0 | 0 |