| Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning | May 30, 2025 | Math, Mathematical Reasoning | Code Available | 1 |
| SiLVR: A Simple Language-based Video Reasoning Framework | May 30, 2025 | Math, MME | Code Available | 1 |
| A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings | May 30, 2025 | Math | Code Available | 1 |
| ChatVLA-2: Vision-Language-Action Model with Open-World Embodied Reasoning from Pretrained Knowledge | May 28, 2025 | Imitation Learning, Math | Code Available | 1 |
| Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start | May 28, 2025 | Math, Multimodal Reasoning | Code Available | 1 |
| REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning | May 27, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging | May 26, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| RaDeR: Reasoning-aware Dense Retrieval Models | May 23, 2025 | Math, Mathematical Problem-Solving | Code Available | 1 |
| Value-Guided Search for Efficient Chain-of-Thought Reasoning | May 23, 2025 | Math | Code Available | 1 |
| Decoupled Visual Interpretation and Linguistic Reasoning for Math Problem Solving | May 23, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning | May 23, 2025 | Math, Reinforcement Learning (RL) | Code Available | 1 |
| Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs | May 22, 2025 | Diagnostic, Machine Unlearning | Code Available | 1 |
| ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges | May 21, 2025 | Math, valid | Code Available | 1 |
| Training Step-Level Reasoning Verifiers with Formal Verification Tools | May 21, 2025 | Formal Logic, Math | Code Available | 1 |
| The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning | May 21, 2025 | Math | Code Available | 1 |
| Let's Verify Math Questions Step by Step | May 20, 2025 | Math, Mathematical Reasoning | Code Available | 1 |
| TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning | May 20, 2025 | Math, Reinforcement Learning (RL) | Code Available | 1 |
| Efficient RL Training for Reasoning Models via Length-Aware Optimization | May 18, 2025 | Math | Code Available | 1 |
| HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems | May 17, 2025 | Arithmetic Reasoning, Code Generation | Code Available | 1 |
| MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports | May 16, 2025 | Diagnostic, Math | Code Available | 1 |
| Kalman Filter Enhanced GRPO for Reinforcement Learning-Based Language Model Reasoning | May 12, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Rewriting Pre-Training Data Boosts LLM Performance in Math and Code | May 5, 2025 | Code Generation, GSM8K | Code Available | 1 |
| DeepCritic: Deliberate Critique with Large Language Models | May 1, 2025 | Math | Code Available | 1 |
| NeMo-Inspector: A Visualization Tool for LLM Generation Analysis | May 1, 2025 | GSM8K, Math | Code Available | 1 |
| Efficient Reasoning for LLMs through Speculative Chain-of-Thought | Apr 27, 2025 | GSM8K, Math | Code Available | 1 |