| Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains | Jun 2, 2025 | MathReinforcement Learning (RL) | —Unverified | 0 |
| Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning | Jun 2, 2025 | Machine UnlearningMath | CodeCode Available | 0 |
| The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | Jun 2, 2025 | MathMathematical Reasoning | CodeCode Available | 2 |
| SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis | Jun 2, 2025 | 8kMath | —Unverified | 0 |
| STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework | Jun 2, 2025 | Math | CodeCode Available | 1 |
| GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking | Jun 1, 2025 | 4kMath | CodeCode Available | 0 |
| SiLVR: A Simple Language-based Video Reasoning Framework | May 30, 2025 | MathMME | CodeCode Available | 1 |
| Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks | May 30, 2025 | Autonomous DrivingMath | CodeCode Available | 1 |
| AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | May 30, 2025 | GPUMath | CodeCode Available | 7 |
| Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking | May 30, 2025 | Language ModelingLanguage Modelling | —Unverified | 0 |