SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning Jun 10, 2025 Reinforcement Learning (RL)
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning Jun 10, 2025 Large Language Model reinforcement-learning
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling Jun 10, 2025 Computational Efficiency Reinforcement Learning (RL)
Intention-Conditioned Flow Occupancy Models Jun 10, 2025 Reinforcement Learning (RL)
Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions Jun 9, 2025 Reinforcement Learning (RL)
WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning Jun 9, 2025 Math Mathematical Reasoning
Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay Jun 5, 2025 Reinforcement Learning (RL)
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models Jun 2, 2025 Instruction Following Reinforcement Learning (RL)
The Hallucination Dilemma: Factuality-Aware Reinforcement Learning for Large Reasoning Models May 30, 2025 Hallucination Mathematical Reasoning
Towards Effective Code-Integrated Reasoning May 30, 2025 Mathematical Reasoning Reinforcement Learning (RL)
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models May 29, 2025 2k 4k
Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering May 29, 2025 Reinforcement Learning (RL)
Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles May 29, 2025 Reinforcement Learning (RL)
Normalizing Flows are Capable Models for RL May 29, 2025 Imitation Learning Reinforcement Learning (RL)
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start May 28, 2025 Math Multimodal Reasoning
R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning May 27, 2025 Code Generation Reinforcement Learning (RL)
MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding May 27, 2025 Reinforcement Learning (RL) Video Understanding
Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design via Constrained RL May 26, 2025 Reinforcement Learning (RL) Specificity
Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning May 25, 2025 Denoising Reinforcement Learning (RL)
SATORI-R1: Incentivizing Multimodal Reasoning with Spatial Grounding and Verifiable Rewards May 25, 2025 Image Captioning Multimodal Reasoning
SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data May 25, 2025 reinforcement-learning Reinforcement Learning
Structured Reinforcement Learning for Combinatorial Decision-Making May 25, 2025 Combinatorial Optimization Decision Making
Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs May 24, 2025 reinforcement-learning Reinforcement Learning
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation May 23, 2025 Image Generation reinforcement-learning
Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning May 23, 2025 Math Reinforcement Learning (RL)
Reinforcement Learning for Ballbot Navigation in Uneven Terrain May 23, 2025 MuJoCo reinforcement-learning
The Cell Must Go On: Agar.io for Continual Reinforcement Learning May 23, 2025 Continual Learning Deep Reinforcement Learning
Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models May 22, 2025 Reinforcement Learning (RL)
RLBenchNet: The Right Network for the Right Reinforcement Learning Task May 21, 2025 continuous-control Continuous Control
GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents May 21, 2025 Answer Generation Reinforcement Learning (RL)
From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning May 21, 2025 Question Answering Reinforcement Learning (RL)
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning May 20, 2025 Math Reinforcement Learning (RL)
Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability May 19, 2025 RAG Reinforcement Learning (RL)
Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs May 19, 2025 Reinforcement Learning (RL)
Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation May 16, 2025 Decision Making Language Modeling
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts May 15, 2025 Continual Learning Language Modeling
Measuring General Intelligence with Generated Games May 12, 2025 In-Context Learning Large Language Model
Kalman Filter Enhanced GRPO for Reinforcement Learning-Based Language Model Reasoning May 12, 2025 Language Modeling Language Modelling
Neurophysiologically Realistic Environment for Comparing Adaptive Deep Brain Stimulation Algorithms in Parkinson Disease Apr 26, 2025 Deep Reinforcement Learning Reinforcement Learning (RL)
Compile Scene Graphs with Reinforcement Learning Apr 18, 2025 reinforcement-learning Reinforcement Learning
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training Apr 13, 2025 Reinforcement Learning (RL)
Harnessing Equivariance: Modeling Turbulence with Graph Neural Networks Apr 10, 2025 Reinforcement Learning (RL)
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining Apr 10, 2025 Mathematical Reasoning Reinforcement Learning (RL)
Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning Apr 9, 2025 reinforcement-learning Reinforcement Learning
Joint Pedestrian and Vehicle Traffic Optimization in Urban Environments using Reinforcement Learning Apr 7, 2025 Reinforcement Learning (RL) Traffic Signal Control
Concise Reasoning via Reinforcement Learning Apr 7, 2025 reinforcement-learning Reinforcement Learning
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models? Apr 2, 2025 Attribute Reinforcement Learning (RL)
GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning Apr 2, 2025 Decision Making Diagnostic
ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning Apr 2, 2025 Reinforcement Learning (RL)
Probabilistically safe and efficient model-based Reinforcement Learning Apr 1, 2025 Model-based Reinforcement Learning Model Predictive Control