| Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning | Jun 11, 2025 | Image CaptioningMath | CodeCode Available | 2 |
| AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions | Jun 10, 2025 | Math | CodeCode Available | 2 |
| Play to Generalize: Learning to Reason Through Game Play | Jun 9, 2025 | Domain GeneralizationMath | CodeCode Available | 2 |
| MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Jun 5, 2025 | MathMathematical Reasoning | CodeCode Available | 2 |
| The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | Jun 2, 2025 | MathMathematical Reasoning | CodeCode Available | 2 |
| Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO | May 28, 2025 | MathReinforcement Learning (RL) | CodeCode Available | 2 |
| Reinforcing General Reasoning without Verifiers | May 27, 2025 | MathMathematical Reasoning | CodeCode Available | 2 |
| R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing | May 27, 2025 | Math | CodeCode Available | 2 |
| MAS-Zero: Designing Multi-Agent Systems with Zero Supervision | May 26, 2025 | MathProblem Decomposition | CodeCode Available | 2 |
| WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning | May 22, 2025 | MathReinforcement Learning (RL) | CodeCode Available | 2 |