| MuJoCo MPC for Humanoid Control: Evaluation on HumanoidBench | Aug 1, 2024 | Humanoid Control, MuJoCo | Code Available | 5 |
| Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation | May 31, 2024 | MuJoCo, reinforcement-learning | Code Available | 5 |
| Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation | May 2, 2024 | MuJoCo, Reinforcement Learning (RL) | Code Available | 5 |
| Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer | Apr 8, 2024 | MuJoCo, Physical Simulations | Code Available | 5 |
| EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine | Jun 21, 2022 | MuJoCo, reinforcement-learning | Code Available | 5 |
| Streaming Deep Reinforcement Learning Finally Works | Oct 18, 2024 | Atari Games, Deep Reinforcement Learning | Code Available | 3 |
| XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library | Dec 25, 2023 | CPU, Deep Reinforcement Learning | Code Available | 3 |
| Learning Bipedal Walking On Planned Footsteps For Humanoid Robots | Jul 26, 2022 | Deep Reinforcement Learning, MuJoCo | Code Available | 3 |
| Tianshou: a Highly Modularized Deep Reinforcement Learning Library | Jul 29, 2021 | Deep Reinforcement Learning, MuJoCo | Code Available | 3 |
| Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement | Oct 15, 2024 | Disentanglement, Inductive Bias | Code Available | 2 |
| Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning | May 27, 2024 | Gym halfcheetah-medium, Gym halfcheetah-medium-expert | Code Available | 2 |
| Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization | May 25, 2024 | continuous-control, Continuous Control | Code Available | 2 |
| Diffusion Actor-Critic with Entropy Regulator | May 24, 2024 | Decision Making, MuJoCo | Code Available | 2 |
| Simple Policy Optimization | Jan 29, 2024 | MuJoCo | Code Available | 2 |
| Text2Reward: Reward Shaping with Language Models for Reinforcement Learning | Sep 20, 2023 | MuJoCo, reinforcement-learning | Code Available | 2 |
| Maximum Entropy Heterogeneous-Agent Reinforcement Learning | Jun 19, 2023 | MuJoCo, Multi-agent Reinforcement Learning | Code Available | 2 |
| Multi-Agent Reinforcement Learning is a Sequence Modeling Problem | May 30, 2022 | Decision Making, MuJoCo | Code Available | 2 |
| JORLDY: a fully customizable open source framework for reinforcement learning | Apr 11, 2022 | MuJoCo, OpenAI Gym | Code Available | 2 |
| Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation | Jun 24, 2021 | MuJoCo, OpenAI Gym | Code Available | 2 |
| robosuite: A Modular Simulation Framework and Benchmark for Robot Learning | Sep 25, 2020 | Gesture Generation, MuJoCo | Code Available | 2 |
| Deep Reinforcement Learning with Gradient Eligibility Traces | Jul 12, 2025 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 |
| Reinforcement Learning for Ballbot Navigation in Uneven Terrain | May 23, 2025 | MuJoCo, reinforcement-learning | Code Available | 1 |
| Model Tensor Planning | May 2, 2025 | model, Model Predictive Control | Code Available | 1 |
| An Real-Sim-Real (RSR) Loop Framework for Generalizable Robotic Policy Transfer with Differentiable Simulation | Mar 13, 2025 | MuJoCo | Code Available | 1 |
| Maximum Entropy Reinforcement Learning with Diffusion Policy | Feb 17, 2025 | Efficient Exploration, MuJoCo | Code Available | 1 |
| Doubly Mild Generalization for Offline Reinforcement Learning | Nov 12, 2024 | MuJoCo, Offline RL | Code Available | 1 |
| FM-TS: Flow Matching for Time Series Generation | Nov 12, 2024 | Benchmarking, Imputation | Code Available | 1 |
| Zonal RL-RRT: Integrated RL-RRT Path Planning with Collision Probability and Zone Connectivity | Oct 31, 2024 | MuJoCo, Q-Learning | Code Available | 1 |
| Learning Successor Features the Simple Way | Oct 29, 2024 | Continual Learning, Deep Reinforcement Learning | Code Available | 1 |
| Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations | Oct 14, 2024 | Dimensionality Reduction, MuJoCo | Code Available | 1 |
| Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning | Oct 11, 2024 | Diversity, MuJoCo | Code Available | 1 |
| LLM-Empowered State Representation for Reinforcement Learning | Jul 18, 2024 | MuJoCo, reinforcement-learning | Code Available | 1 |
| Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning | Jul 17, 2024 | MuJoCo, reinforcement-learning | Code Available | 1 |
| RRLS: Robust Reinforcement Learning Suite | Jun 12, 2024 | continuous-control, Continuous Control | Code Available | 1 |
| Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning | Jun 7, 2024 | Contrastive Learning, Meta Reinforcement Learning | Code Available | 1 |
| Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow | May 22, 2024 | Ingenuity, MuJoCo | Code Available | 1 |
| S^2AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic | May 2, 2024 | MuJoCo, Variational Inference | Code Available | 1 |
| No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO | May 1, 2024 | MuJoCo, Reinforcement Learning (RL) | Code Available | 1 |
| UCB-driven Utility Function Search for Multi-objective Reinforcement Learning | May 1, 2024 | Decision Making, MuJoCo | Code Available | 1 |
| Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference | Feb 7, 2024 | MuJoCo | Code Available | 1 |
| Efficient Reinforcement Learning via Decoupling Exploration and Utilization | Dec 26, 2023 | Autonomous Vehicles, MuJoCo | Code Available | 1 |
| World Models via Policy-Guided Trajectory Diffusion | Dec 13, 2023 | continuous-control, Continuous Control | Code Available | 1 |
| Optimistic Multi-Agent Policy Gradient | Nov 3, 2023 | MuJoCo, Q-Learning | Code Available | 1 |
| Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning | Oct 19, 2023 | MuJoCo, Prompt Engineering | Code Available | 1 |
| Practical Probabilistic Model-based Deep Reinforcement Learning by Integrating Dropout Uncertainty and Trajectory Sampling | Sep 20, 2023 | Deep Reinforcement Learning, Model-based Reinforcement Learning | Code Available | 1 |
| A Bayesian Approach to Robust Inverse Reinforcement Learning | Sep 15, 2023 | Imitation Learning, MuJoCo | Code Available | 1 |
| Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization | Jul 21, 2023 | Management, MuJoCo | Code Available | 1 |
| Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation | Jul 17, 2023 | MuJoCo, reinforcement-learning | Code Available | 1 |
| Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration | May 29, 2023 | MuJoCo | Code Available | 1 |
| Policy Representation via Diffusion Probability Model for Reinforcement Learning | May 22, 2023 | continuous-control, Continuous Control | Code Available | 1 |