| Rectified Sparse Attention | Jun 4, 2025 | Language Modeling | Unverified | 0 |
| OpenThoughts: Data Recipes for Reasoning Models | Jun 4, 2025 | Math | Code Available | 7 |
| Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models | Jun 4, 2025 | Math | Code Available | 1 |
| MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching | Jun 3, 2025 | Data Augmentation, Instruction Following | Unverified | 0 |
| Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem | Jun 3, 2025 | GPU, Math | Unverified | 0 |
| Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains | Jun 2, 2025 | Math, Reinforcement Learning (RL) | Unverified | 0 |
| Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning | Jun 2, 2025 | Machine Unlearning, Math | Code Available | 0 |
| The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | Jun 2, 2025 | Math, Mathematical Reasoning | Code Available | 2 |
| SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis | Jun 2, 2025 | 8k, Math | Unverified | 0 |
| STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework | Jun 2, 2025 | Math | Code Available | 1 |
| GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking | Jun 1, 2025 | 4k, Math | Code Available | 0 |
| SiLVR: A Simple Language-based Video Reasoning Framework | May 30, 2025 | Math, MME | Code Available | 1 |
| Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks | May 30, 2025 | Autonomous Driving, Math | Code Available | 1 |
| Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models | May 30, 2025 | Math, Multiple-choice | Code Available | 0 |
| Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning | May 30, 2025 | Math, reinforcement-learning | Unverified | 0 |
| A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings | May 30, 2025 | Math | Code Available | 1 |
| Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking | May 30, 2025 | Language Modeling | Unverified | 0 |
| Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning | May 30, 2025 | Math, Mathematical Reasoning | Code Available | 1 |
| AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | May 30, 2025 | GPU, Math | Code Available | 7 |
| Let's Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM's Math Capability | May 29, 2025 | Math, Mathematical Reasoning | Unverified | 0 |
| Discriminative Policy Optimization for Token-Level Reward Models | May 29, 2025 | GSM8K, Language Modeling | Code Available | 0 |
| Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation | May 29, 2025 | GSM8K, Math | Unverified | 0 |
| PBEBench: A Multi-Step Programming by Examples Reasoning Benchmark inspired by Historical Linguistics | May 29, 2025 | Math | Unverified | 0 |
| Matryoshka Model Learning for Improved Elastic Student Models | May 29, 2025 | LAMBADA, Math | Unverified | 0 |
| Infi-MMR: Curriculum-based Unlocking Multimodal Reasoning via Phased Reinforcement Learning in Multimodal Small Language Models | May 29, 2025 | Logical Reasoning, Math | Unverified | 0 |