| MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task | Feb 17, 2025 | Code Completion, GSM8K | —Unverified | 0 |
| Leveraging Uncertainty Estimation for Efficient LLM Routing | Feb 16, 2025 | GSM8K, MMLU | —Unverified | 0 |
| Balancing the Budget: Understanding Trade-offs Between Supervised and Preference-Based Finetuning | Feb 16, 2025 | GSM8K | —Unverified | 0 |
| Uncertainty-Aware Search and Value Models: Mitigating Search Scaling Flaws in LLMs | Feb 16, 2025 | GSM8K, Thompson Sampling | —Unverified | 0 |
| Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls | Feb 16, 2025 | Computational Efficiency, GSM8K | Code Available | 0 |
| Hybrid Offline-online Scheduling Method for Large Language Model Inference Optimization | Feb 14, 2025 | GSM8K, Inference Optimization | —Unverified | 0 |
| Cost-Saving LLM Cascades with Early Abstention | Feb 13, 2025 | GSM8K, MMLU | —Unverified | 0 |
| Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges | Feb 12, 2025 | GSM8K, Math | Code Available | 0 |
| Self-Training Large Language Models for Tool-Use Without Demonstrations | Feb 9, 2025 | GSM8K, Mathematical Reasoning | —Unverified | 0 |
| Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization | Feb 8, 2025 | GSM8K, Math | —Unverified | 0 |