| The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve? | Feb 24, 2025 | Arithmetic Reasoning; Common Sense Reasoning | —Unverified | 0 |
| Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights | Feb 18, 2025 | Arithmetic Reasoning; Common Sense Reasoning | —Unverified | 0 |
| On Representational Dissociation of Language and Arithmetic in Large Language Models | Feb 17, 2025 | Arithmetic Reasoning | —Unverified | 0 |
| Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding | Feb 17, 2025 | Arithmetic Reasoning; Chart Understanding | —Unverified | 0 |
| Can LLMs Maintain Fundamental Abilities under KV Cache Compression? | Feb 4, 2025 | Arithmetic Reasoning; Code Generation | —Unverified | 0 |
| CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization | Jan 30, 2025 | Arithmetic Reasoning; Text Generation | —Unverified | 0 |
| SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training | Jan 28, 2025 | Arithmetic Reasoning; Memorization | —Unverified | 0 |
| DoTA: Weight-Decomposed Tensor Adaptation for Large Language Models | Dec 30, 2024 | Arithmetic Reasoning; Quantization | —Unverified | 0 |
| Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning | Dec 23, 2024 | Arithmetic Reasoning; GSM8K | —Unverified | 0 |
| Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs | Dec 19, 2024 | Arithmetic Reasoning; Code Generation | —Unverified | 0 |
| Hint Marginalization for Improved Reasoning in Large Language Models | Dec 17, 2024 | Arithmetic Reasoning | —Unverified | 0 |
| GaLore+: Boosting Low-Rank Adaptation for LLMs with Cross-Head Projection | Dec 15, 2024 | Arithmetic Reasoning; Text Generation | —Unverified | 0 |
| S^2FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity | Dec 9, 2024 | Arithmetic Reasoning | —Unverified | 0 |
| Think-to-Talk or Talk-to-Think? When LLMs Come Up with an Answer in Multi-Step Arithmetic Reasoning | Dec 2, 2024 | Arithmetic Reasoning | —Unverified | 0 |
| PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model | Nov 12, 2024 | Arithmetic Reasoning; Mixture-of-Experts | —Unverified | 0 |
| Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning | Nov 4, 2024 | Arithmetic Reasoning; Decoder | Code Available | 0 |
| Think Beyond Size: Adaptive Prompting for More Effective Reasoning | Oct 10, 2024 | Arithmetic Reasoning; Computational Efficiency | —Unverified | 0 |
| Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models | Oct 10, 2024 | Arithmetic Reasoning; Math | Code Available | 0 |
| Unlocking Structured Thinking in Language Models with Cognitive Prompting | Oct 3, 2024 | Arithmetic Reasoning; GSM8K | —Unverified | 0 |
| Small Language Models are Equation Reasoners | Sep 19, 2024 | Arithmetic Reasoning; Knowledge Distillation | —Unverified | 0 |
| 3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability | Aug 28, 2024 | Arithmetic Reasoning; GPU | Code Available | 0 |
| Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks | Jul 25, 2024 | Arithmetic Reasoning | —Unverified | 0 |
| Leveraging LLM Reasoning Enhances Personalized Recommender Systems | Jul 22, 2024 | Arithmetic Reasoning; Recommendation Systems | —Unverified | 0 |
| Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together | Jul 15, 2024 | Arithmetic Reasoning; Language Modeling | —Unverified | 0 |
| Self-training Language Models for Arithmetic Reasoning | Jul 11, 2024 | Arithmetic Reasoning | Code Available | 0 |
| SBoRA: Low-Rank Adaptation with Regional Weight Updates | Jul 7, 2024 | Arithmetic Reasoning; parameter-efficient fine-tuning | Code Available | 0 |
| Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback | Jun 25, 2024 | Arithmetic Reasoning; Relation | Code Available | 0 |
| Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models | Jun 6, 2024 | Arithmetic Reasoning; Code Generation | Code Available | 0 |
| Arithmetic Reasoning with LLM: Prolog Generation & Permutation | May 28, 2024 | Arithmetic Reasoning; Data Augmentation | —Unverified | 0 |
| Large Language Models Can Self-Correct with Key Condition Verification | May 23, 2024 | Arithmetic Reasoning; Math | —Unverified | 0 |
| Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs | May 21, 2024 | Arithmetic Reasoning; Decision Making | —Unverified | 0 |
| Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment | May 6, 2024 | Arithmetic Reasoning; Code Generation | —Unverified | 0 |
| Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | Mar 12, 2024 | Arithmetic Reasoning; Code Generation | Code Available | 0 |
| The Claude 3 Model Family: Opus, Sonnet, Haiku | Mar 4, 2024 | 1 Image, 2*2 Stitching; Arithmetic Reasoning | —Unverified | 0 |
| SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning | Feb 20, 2024 | Arithmetic Reasoning; GSM8K | —Unverified | 0 |
| Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering | Feb 17, 2024 | Arithmetic Reasoning; Mathematical Reasoning | —Unverified | 0 |
| Orca-Math: Unlocking the potential of SLMs in Grade School Math | Feb 16, 2024 | Arithmetic Reasoning; GSM8K | —Unverified | 0 |
| Exploring Group and Symmetry Principles in Large Language Models | Feb 9, 2024 | Arithmetic Reasoning; Negation | —Unverified | 0 |
| The Unreasonable Effectiveness of Eccentric Automatic Prompts | Feb 9, 2024 | Arithmetic Reasoning; GSM8K | —Unverified | 0 |
| Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting | Jan 28, 2024 | Arithmetic Reasoning; Fact Checking | —Unverified | 0 |
| Large Language Models are Null-Shot Learners | Jan 16, 2024 | Arithmetic Reasoning; Benchmarking | —Unverified | 0 |
| LLM Augmented LLMs: Expanding Capabilities through Composition | Jan 4, 2024 | Arithmetic Reasoning; Code Generation | Code Available | 0 |
| TinyGSM: achieving >80% on GSM8k with small language models | Dec 14, 2023 | Arithmetic Reasoning; GSM8K | —Unverified | 0 |
| Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning | Dec 14, 2023 | Arithmetic Reasoning; Few-Shot Learning | —Unverified | 0 |
| Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning | Dec 9, 2023 | Arithmetic Reasoning; Mathematical Reasoning | Code Available | 0 |
| ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions | Dec 4, 2023 | Arithmetic Reasoning; Math | Code Available | 0 |
| Orca 2: Teaching Small Language Models How to Reason | Nov 18, 2023 | Arithmetic Reasoning; Common Sense Reasoning | —Unverified | 0 |
| The ART of LLM Refinement: Ask, Refine, and Trust | Nov 14, 2023 | Arithmetic Reasoning; GSM8K | —Unverified | 0 |
| Prompt Sketching for Large Language Models | Nov 8, 2023 | Arithmetic Reasoning; Benchmarking | —Unverified | 0 |
| KwaiYiiMath: Technical Report | Oct 11, 2023 | Arithmetic Reasoning; GSM8K | —Unverified | 0 |