| Learning to Rank Chain-of-Thought: An Energy-Based Approach with Outcome Supervision | May 21, 2025 | GSM8K, Learning-To-Rank | —Unverified | 0 |
| SSR: Speculative Parallel Scaling Reasoning in Test-time | May 21, 2025 | Diversity, Math | —Unverified | 0 |
| MAPS: A Multilingual Benchmark for Global Agent Performance and Security | May 21, 2025 | Code Generation, Math | —Unverified | 0 |
| Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning | May 21, 2025 | Language Modeling, Language Modelling | —Unverified | 0 |
| SCOPE: Compress Mathematical Reasoning Steps for Efficient Automated Process Annotation | May 20, 2025 | Mathematical Reasoning | Code Available | 0 |
| Can Pruning Improve Reasoning? Revisiting Long-CoT Compression with Capability in Mind for Better Reasoning | May 20, 2025 | Large Language Model, Mathematical Reasoning | —Unverified | 0 |
| AAPO: Enhance the Reasoning Capabilities of LLMs with Advantage Momentum | May 20, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | —Unverified | 0 |
| Text Generation Beyond Discrete Token Sampling | May 20, 2025 | Code Generation, Mathematical Reasoning | —Unverified | 0 |
| OSoRA: Output-Dimension and Singular-Value Initialized Low-Rank Adaptation | May 20, 2025 | Common Sense Reasoning, Mathematical Reasoning | —Unverified | 0 |
| DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models | May 20, 2025 | GSM8K, Mathematical Reasoning | —Unverified | 0 |