| Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence | May 23, 2025 | GPU, Large Language Model | Unverified | 0 |
| RaDeR: Reasoning-aware Dense Retrieval Models | May 23, 2025 | Math, Mathematical Problem-Solving | Code Available | 1 |
| The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs | May 23, 2025 | Cross-Lingual Transfer, Math | Unverified | 0 |
| Amplify Adjacent Token Differences: Enhancing Long Chain-of-Thought Reasoning with Shift-FFN | May 22, 2025 | Mathematical Reasoning | Unverified | 0 |
| Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms | May 22, 2025 | Adversarial Attack, Benchmarking | Unverified | 0 |
| KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning | May 22, 2025 | Mathematical Reasoning, reinforcement-learning | Code Available | 1 |
| SMART: Self-Generating and Self-Validating Multi-Dimensional Assessment for LLMs' Mathematical Problem Solving | May 22, 2025 | Diagnostic, Mathematical Problem-Solving | Unverified | 0 |
| HOFT: Householder Orthogonal Fine-tuning | May 22, 2025 | Machine Translation, Mathematical Reasoning | Unverified | 0 |
| Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains | May 22, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | Unverified | 0 |
| EquivPruner: Boosting Efficiency and Quality in LLM-Based Search via Action Pruning | May 22, 2025 | GSM8K, Math | Code Available | 0 |