| Title | Date | Tags | Code | # |
| --- | --- | --- | --- | --- |
| FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving | Feb 27, 2025 | GSM8K, Math | Code Available | 1 |
| Self-Training Elicits Concise Reasoning in Large Language Models | Feb 27, 2025 | GSM8K, In-Context Learning | Code Available | 1 |
| Distill Not Only Data but Also Rewards: Can Smaller Language Models Surpass Larger Ones? | Feb 26, 2025 | GSM8K, MMLU | Unverified | 0 |
| Weaker LLMs' Opinions Also Matter: Mixture of Opinions Enhances LLM's Mathematical Reasoning | Feb 26, 2025 | GSM8K, Mathematical Reasoning | Unverified | 0 |
| SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models | Feb 25, 2025 | Continual Learning, GSM8K | Unverified | 0 |
| LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint | Feb 24, 2025 | GSM8K | Unverified | 0 |
| Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models | Feb 24, 2025 | GSM8K, Math | Code Available | 2 |
| Dynamic Parallel Tree Search for Efficient LLM Reasoning | Feb 22, 2025 | Computational Efficiency, GSM8K | Unverified | 0 |
| Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective | Feb 20, 2025 | GSM8K, Math | Code Available | 0 |
| NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models | Feb 20, 2025 | GSM8K, Natural Language Understanding | Code Available | 0 |