MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning | Mar 10, 2025 | Multimodal Reasoning, Reinforcement Learning (RL)
Mastering Diverse Domains through World Models | Jan 10, 2023 | Atari Games 100k, Decision Making
RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem | Nov 25, 2020 | reinforcement-learning, Reinforcement Learning
Kwai Keye-VL Technical Report | Jul 2, 2025 | Instruction Following, Reinforcement Learning (RL)
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning | Mar 18, 2025 | 3D Face Animation, Common Sense Reasoning
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL | Mar 10, 2025 | Logical Reasoning, Multimodal Reasoning
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models | Apr 15, 2025 | Humanoid Control, Reinforcement Learning (RL)
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models | May 18, 2025 | Reinforcement Learning (RL)
TorchRL: A data-driven decision-making library for PyTorch | Jun 1, 2023 | Computational Efficiency, Decision Making
Discovering faster matrix multiplication algorithms with reinforcement learning | Oct 5, 2022 | Deep Reinforcement Learning, reinforcement-learning
TDMPBC: Self-Imitative Reinforcement Learning for Humanoid Robot Control | Feb 24, 2025 | reinforcement-learning, Reinforcement Learning
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models | Mar 20, 2025 | Benchmarking, Reinforcement Learning (RL)
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | May 1, 2025 | Image Generation, Reinforcement Learning (RL)
Video-R1: Reinforcing Video Reasoning in MLLMs | Mar 27, 2025 | MVBench, Reinforcement Learning (RL)
SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning | Aug 14, 2024 | CPU, Motion Planning
RLlib: Abstractions for Distributed Reinforcement Learning | Dec 26, 2017 | reinforcement-learning, Reinforcement Learning
s3: You Don't Need That Much Data to Train a Search Agent via RL | May 20, 2025 | RAG, Reinforcement Learning (RL)
RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark | Jun 29, 2023 | Combinatorial Optimization, Computational Efficiency
Skywork Open Reasoner 1 Technical Report | May 28, 2025 | Math, Reinforcement Learning (RL)
R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO | May 22, 2025 | Reinforcement Learning (RL)
Discovered Policy Optimisation | Oct 11, 2022 | Ingenuity, Meta-Learning
Distributed Prioritized Experience Replay | Mar 2, 2018 | Atari Games, Deep Reinforcement Learning
Practical Deep Reinforcement Learning Approach for Stock Trading | Nov 19, 2018 | Deep Reinforcement Learning, reinforcement-learning
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning | May 5, 2025 | Reinforcement Learning (RL)
Rainbow: Combining Improvements in Deep Reinforcement Learning | Oct 6, 2017 | Atari Games, Deep Reinforcement Learning
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning | May 13, 2025 | Reinforcement Learning (RL), Visual Reasoning
Demystifying Long Chain-of-Thought Reasoning in LLMs | Feb 5, 2025 | Reinforcement Learning (RL)
OpenSpiel: A Framework for Reinforcement Learning in Games | Aug 26, 2019 | General Reinforcement Learning, reinforcement-learning
On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning | Nov 10, 2021 | Multi-agent Reinforcement Learning, reinforcement-learning
OGBench: Benchmarking Offline Goal-Conditioned RL | Oct 26, 2024 | Benchmarking, reinforcement-learning
Adversarial Cheap Talk | Nov 20, 2022 | Meta-Learning, Reinforcement Learning (RL)
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research | May 16, 2023 | Philosophy, reinforcement-learning
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning | Feb 5, 2024 | reinforcement-learning, Reinforcement Learning
Perception-R1: Pioneering Perception Policy with Reinforcement Learning | Apr 10, 2025 | reinforcement-learning, Reinforcement Learning
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning | Feb 26, 2024 | GPU, Minecraft
Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning | Oct 11, 2022 | reinforcement-learning, Reinforcement Learning
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs | Mar 3, 2025 | Reinforcement Learning (RL)
MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library | Oct 11, 2022 | Multi-agent Reinforcement Learning, reinforcement-learning
Learning Bipedal Walking On Planned Footsteps For Humanoid Robots | Jul 26, 2022 | Deep Reinforcement Learning, MuJoCo
Learning Bipedal Walking for Humanoids with Current Feedback | Mar 7, 2023 | Deep Reinforcement Learning, Reinforcement Learning (RL)
Learning to Reason under Off-Policy Guidance | Apr 21, 2025 | Math, Reinforcement Learning (RL)
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms | Nov 16, 2021 | Benchmarking, Deep Reinforcement Learning
CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving | May 15, 2024 | Autonomous Driving, Autonomous Vehicles
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse | Mar 24, 2025 | Layout Generation, Reinforcement Learning (RL)
Is Value Learning Really the Main Bottleneck in Offline RL? | Jun 13, 2024 | Imitation Learning, Offline RL
CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control | Oct 4, 2024 | Motion Generation, Reinforcement Learning (RL)
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning | Apr 15, 2025 | Mathematical Reasoning, Reinforcement Learning (RL)
Deep Reinforcement Learning | Oct 15, 2018 | Deep Reinforcement Learning, Management
Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey on Hybrid Algorithms | Jan 22, 2024 | Evolutionary Algorithms, reinforcement-learning
imitation: Clean Imitation Learning Implementations | Nov 22, 2022 | Imitation Learning, reinforcement-learning