| Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting | Feb 5, 2025 | GSM8KMath | CodeCode Available | 0 |
| Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment | Feb 5, 2025 | GSM8KHumanEval | —Unverified | 0 |
| BARE: Leveraging Base Language Models for Few-Shot Synthetic Data Generation | Feb 3, 2025 | DiversityGSM8K | —Unverified | 0 |
| ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference | Feb 1, 2025 | GPUGSM8K | —Unverified | 0 |
| Pheromone-based Learning of Optimal Reasoning Paths | Jan 31, 2025 | ARCGSM8K | —Unverified | 0 |
| RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations | Jan 25, 2025 | Computational EfficiencyGSM8K | —Unverified | 0 |
| Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs | Jan 21, 2025 | GSM8KIn-Context Learning | —Unverified | 0 |
| DNA 1.0 Technical Report | Jan 18, 2025 | BelebeleGSM8K | —Unverified | 0 |
| ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving | Jan 14, 2025 | GSM8KMath | CodeCode Available | 0 |
| DiscQuant: A Quantization Method for Neural Networks Inspired by Discrepancy Theory | Jan 11, 2025 | GSM8KQuantization | CodeCode Available | 0 |
| Semantic Exploration with Adaptive Gating for Efficient Problem Solving with Language Models | Jan 10, 2025 | ARCDiversity | —Unverified | 0 |
| InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion | Jan 6, 2025 | GSM8KHumanEval | —Unverified | 0 |
| Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models | Jan 3, 2025 | GSM8KMath | —Unverified | 0 |
| DIVE: Diversified Iterative Self-Improvement | Jan 1, 2025 | DiversityGSM8K | CodeCode Available | 0 |
| Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs | Dec 30, 2024 | GSM8K | —Unverified | 0 |
| LLM2: Let Large Language Models Harness System 2 Reasoning | Dec 29, 2024 | GSM8KMathematical Reasoning | CodeCode Available | 0 |
| Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning | Dec 23, 2024 | Arithmetic ReasoningGSM8K | —Unverified | 0 |
| System-2 Mathematical Reasoning via Enriched Instruction Tuning | Dec 22, 2024 | ERPGSM8K | —Unverified | 0 |
| Ask-Before-Detection: Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions | Dec 22, 2024 | GSM8KMath | —Unverified | 0 |
| Inference Scaling vs Reasoning: An Empirical Analysis of Compute-Optimal LLM Problem-Solving | Dec 20, 2024 | Computational EfficiencyGSM8K | CodeCode Available | 0 |
| Enhancing Knowledge Distillation for LLMs with Response-Priming Prompting | Dec 18, 2024 | GSM8KKnowledge Distillation | CodeCode Available | 0 |
| Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree | Dec 17, 2024 | GSM8KHumanEval | —Unverified | 0 |
| A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions | Dec 12, 2024 | GSM8KKnowledge Graphs | —Unverified | 0 |
| Learning to Reason via Self-Iterative Process Feedback for Small Language Models | Dec 11, 2024 | Domain GeneralizationGSM8K | —Unverified | 0 |
| SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs | Dec 11, 2024 | ARCGSM8K | —Unverified | 0 |