| Balancing LoRA Performance and Efficiency with Simple Shard Sharing | Sep 19, 2024 | Computational EfficiencyGSM8K | CodeCode Available | 2 |
| LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning | Sep 19, 2024 | GSM8KLogical Reasoning | CodeCode Available | 0 |
| Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement | Sep 18, 2024 | GSM8KMath | —Unverified | 0 |
| Improving LLM Reasoning with Multi-Agent Tree-of-Thought Validator Agent | Sep 17, 2024 | GSM8KQuestion Answering | CodeCode Available | 1 |
| CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks | Sep 13, 2024 | ARCCode Generation | —Unverified | 0 |
| STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning | Sep 10, 2024 | GSM8KMixture-of-Experts | —Unverified | 0 |
| Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation | Sep 5, 2024 | GSM8K | —Unverified | 0 |
| Prompt Baking | Sep 4, 2024 | ARCGSM8K | —Unverified | 0 |
| CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Sep 4, 2024 | GSM8KMath | CodeCode Available | 2 |
| Building Math Agents with Multi-Turn Iterative Preference Learning | Sep 4, 2024 | GSM8KMath | —Unverified | 0 |
| S^3c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners | Sep 3, 2024 | GSM8KMath | —Unverified | 0 |
| Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems | Aug 29, 2024 | GSM8KLanguage Modeling | —Unverified | 0 |
| Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic | Aug 29, 2024 | GSM8KLanguage Modeling | —Unverified | 0 |
| SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models | Aug 28, 2024 | Data AugmentationGSM8K | —Unverified | 0 |
| SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models | Aug 21, 2024 | 8kGSM8K | CodeCode Available | 1 |
| Threshold Filtering Packing for Supervised Fine-Tuning: Training Related Samples within Packs | Aug 18, 2024 | DiversityGPU | —Unverified | 0 |
| SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models | Aug 16, 2024 | GSM8KMMLU | —Unverified | 0 |
| Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers | Aug 12, 2024 | GSM8KMath | CodeCode Available | 4 |
| Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula | Aug 8, 2024 | GSM8KLanguage Modeling | CodeCode Available | 1 |
| Large Language Monkeys: Scaling Inference Compute with Repeated Sampling | Jul 31, 2024 | GSM8KMath | CodeCode Available | 3 |
| Cool-Fusion: Fuse Large Language Models without Training | Jul 29, 2024 | Combinatorial OptimizationGSM8K | —Unverified | 0 |
| Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process | Jul 29, 2024 | GSM8KMath | CodeCode Available | 2 |
| Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost | Jul 29, 2024 | GSM8KPrompt Engineering | —Unverified | 0 |
| Learning Goal-Conditioned Representations for Language Reward Models | Jul 18, 2024 | GSM8KMath | CodeCode Available | 1 |
| Weak-to-Strong Reasoning | Jul 18, 2024 | GSM8KMath | CodeCode Available | 2 |