| Enhancing Knowledge Distillation for LLMs with Response-Priming Prompting | Dec 18, 2024 | GSM8K, Knowledge Distillation | Code Available | 0 |
| Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree | Dec 17, 2024 | GSM8K, HumanEval | Unverified | 0 |
| SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator | Dec 16, 2024 | GSM8K, Language Modeling | Code Available | 4 |
| Entropy-Regularized Process Reward Model | Dec 15, 2024 | GSM8K, Math | Code Available | 1 |
| GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers | Dec 12, 2024 | GSM8K, Prompt Engineering | Code Available | 1 |
| Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries | Dec 12, 2024 | 4k, GSM8K | Code Available | 1 |
| A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions | Dec 12, 2024 | GSM8K, Knowledge Graphs | Unverified | 0 |
| SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs | Dec 11, 2024 | ARC, GSM8K | Unverified | 0 |
| Learning to Reason via Self-Iterative Process Feedback for Small Language Models | Dec 11, 2024 | Domain Generalization, GSM8K | Unverified | 0 |
| ProcessBench: Identifying Process Errors in Mathematical Reasoning | Dec 9, 2024 | GSM8K, Math | Code Available | 2 |
| Evolutionary Pre-Prompt Optimization for Mathematical Reasoning | Dec 5, 2024 | Few-Shot Learning, GSM8K | Unverified | 0 |
| How to Correctly do Semantic Backpropagation on Language-based Agentic Systems | Dec 4, 2024 | GSM8K | Code Available | 2 |
| Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning | Dec 4, 2024 | GSM8K, Language Modeling | Unverified | 0 |
| MALT: Improving Reasoning with Multi-Agent LLM Training | Dec 2, 2024 | Common Sense Reasoning, GSM8K | Unverified | 0 |
| Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability | Nov 29, 2024 | GSM8K, Math | Code Available | 1 |
| Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference | Nov 27, 2024 | GSM8K, Language Modeling | Unverified | 0 |
| Predicting Emergent Capabilities by Finetuning | Nov 25, 2024 | CoLA, GSM8K | Unverified | 0 |
| Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures | Nov 25, 2024 | GSM8K, Math | Unverified | 0 |
| Preference Optimization for Reasoning with Pseudo Feedback | Nov 25, 2024 | GSM8K, Math | Code Available | 2 |
| Patience Is The Key to Large Language Model Reasoning | Nov 20, 2024 | GSM8K, Language Modeling | Unverified | 0 |
| Adaptive Decoding via Latent Preference Optimization | Nov 14, 2024 | GSM8K, Instruction Following | Unverified | 0 |
| Dynamic Subset Tuning: Expanding the Operational Range of Parameter-Efficient Training for Large Language Models | Nov 13, 2024 | GSM8K | Unverified | 0 |
| What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? | Nov 12, 2024 | GSM8K, Math | Code Available | 1 |
| UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts | Nov 11, 2024 | Code Generation, GSM8K | Code Available | 1 |
| Quasi-random Multi-Sample Inference for Large Language Models | Nov 9, 2024 | Diversity, GSM8K | Unverified | 0 |