| Enhancing Knowledge Distillation for LLMs with Response-Priming Prompting | Dec 18, 2024 | GSM8KKnowledge Distillation | CodeCode Available | 0 |
| Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree | Dec 17, 2024 | GSM8KHumanEval | —Unverified | 0 |
| SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator | Dec 16, 2024 | GSM8KLanguage Modeling | CodeCode Available | 4 |
| Entropy-Regularized Process Reward Model | Dec 15, 2024 | GSM8KMath | CodeCode Available | 1 |
| GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers | Dec 12, 2024 | GSM8KPrompt Engineering | CodeCode Available | 1 |
| Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries | Dec 12, 2024 | 4kGSM8K | CodeCode Available | 1 |
| A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions | Dec 12, 2024 | GSM8KKnowledge Graphs | —Unverified | 0 |
| SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs | Dec 11, 2024 | ARCGSM8K | —Unverified | 0 |
| Learning to Reason via Self-Iterative Process Feedback for Small Language Models | Dec 11, 2024 | Domain GeneralizationGSM8K | —Unverified | 0 |
| ProcessBench: Identifying Process Errors in Mathematical Reasoning | Dec 9, 2024 | GSM8KMath | CodeCode Available | 2 |
| Evolutionary Pre-Prompt Optimization for Mathematical Reasoning | Dec 5, 2024 | Few-Shot LearningGSM8K | —Unverified | 0 |
| How to Correctly do Semantic Backpropagation on Language-based Agentic Systems | Dec 4, 2024 | GSM8K | CodeCode Available | 2 |
| Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning | Dec 4, 2024 | GSM8KLanguage Modeling | —Unverified | 0 |
| MALT: Improving Reasoning with Multi-Agent LLM Training | Dec 2, 2024 | Common Sense ReasoningGSM8K | —Unverified | 0 |
| Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability | Nov 29, 2024 | GSM8KMath | CodeCode Available | 1 |
| Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference | Nov 27, 2024 | GSM8KLanguage Modeling | —Unverified | 0 |
| Predicting Emergent Capabilities by Finetuning | Nov 25, 2024 | CoLAGSM8K | —Unverified | 0 |
| Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures | Nov 25, 2024 | GSM8KMath | —Unverified | 0 |
| Preference Optimization for Reasoning with Pseudo Feedback | Nov 25, 2024 | GSM8KMath | CodeCode Available | 2 |
| Patience Is The Key to Large Language Model Reasoning | Nov 20, 2024 | GSM8KLanguage Modeling | —Unverified | 0 |
| Adaptive Decoding via Latent Preference Optimization | Nov 14, 2024 | GSM8KInstruction Following | —Unverified | 0 |
| Dynamic Subset Tuning: Expanding the Operational Range of Parameter-Efficient Training for Large Language Models | Nov 13, 2024 | GSM8K | —Unverified | 0 |
| What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? | Nov 12, 2024 | GSM8KMath | CodeCode Available | 1 |
| UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts | Nov 11, 2024 | Code GenerationGSM8K | CodeCode Available | 1 |
| Quasi-random Multi-Sample Inference for Large Language Models | Nov 9, 2024 | DiversityGSM8K | —Unverified | 0 |
| Reasoning Robustness of LLMs to Adversarial Typographical Errors | Nov 8, 2024 | GSM8KMMLU | —Unverified | 0 |
| Kwai-STaR: Transform LLMs into State-Transition Reasoners | Nov 7, 2024 | GSM8KMathematical Problem-Solving | —Unverified | 0 |
| Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding | Nov 6, 2024 | ARCGSM8K | CodeCode Available | 2 |
| Self-Consistency Preference Optimization | Nov 6, 2024 | GSM8KMath | —Unverified | 0 |
| Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models | Nov 2, 2024 | GSM8KMath | —Unverified | 0 |
| Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation | Oct 27, 2024 | GSM8KLanguage Modeling | —Unverified | 0 |
| LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization | Oct 27, 2024 | GSM8KHellaSwag | CodeCode Available | 1 |
| ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning | Oct 24, 2024 | GSM8KMath | —Unverified | 0 |
| Scaling up Masked Diffusion Models on Text | Oct 24, 2024 | GSM8KLanguage Modeling | CodeCode Available | 3 |
| Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment | Oct 23, 2024 | GSM8KHumanEval | —Unverified | 0 |
| Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Oct 22, 2024 | GSM8KLanguage Modeling | CodeCode Available | 1 |
| Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation | Oct 22, 2024 | GSM8KMath | —Unverified | 0 |
| SMART: Self-learning Meta-strategy Agent for Reasoning Tasks | Oct 21, 2024 | GSM8KSelf-Learning | CodeCode Available | 0 |
| On Designing Effective RL Reward at Training Time for LLM Reasoning | Oct 19, 2024 | GSM8KMath | —Unverified | 0 |
| TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling | Oct 18, 2024 | Computational EfficiencyGSM8K | —Unverified | 0 |
| SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation | Oct 17, 2024 | GSM8KLanguage Modeling | CodeCode Available | 0 |
| Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning | Oct 16, 2024 | AllGSM8K | CodeCode Available | 0 |
| MIND: Math Informed syNthetic Dialogues for Pretraining LLMs | Oct 15, 2024 | GSM8KMath | —Unverified | 0 |
| One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Oct 14, 2024 | FairnessGSM8K | CodeCode Available | 0 |
| How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective | Oct 14, 2024 | Density Ratio EstimationGSM8K | CodeCode Available | 0 |
| COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement | Oct 12, 2024 | Code GenerationComputational Efficiency | CodeCode Available | 0 |
| Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Oct 11, 2024 | GSM8KLanguage Modeling | CodeCode Available | 2 |
| Nudging: Inference-time Alignment of LLMs via Guided Decoding | Oct 11, 2024 | General KnowledgeGSM8K | —Unverified | 0 |
| SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights | Oct 11, 2024 | GSM8KMath | CodeCode Available | 4 |
| Towards Multilingual LLM Evaluation for European Languages | Oct 11, 2024 | ARCGSM8K | —Unverified | 0 |