| On the Perturbed States for Transformed Input-robust Reinforcement Learning | Jul 31, 2024 | Denoising, MuJoCo | Code Available | 0 |
| SOAP-RL: Sequential Option Advantage Propagation for Reinforcement Learning in POMDP Environments | Jul 26, 2024 | MuJoCo | Code Available | 0 |
| Maximum Entropy On-Policy Actor-Critic via Entropy Advantage Estimation | Jul 25, 2024 | MuJoCo | Unverified | 0 |
| Learning Constraint Network from Demonstrations via Positive-Unlabeled Learning with Memory Replay | Jul 23, 2024 | MuJoCo | Unverified | 0 |
| Proximal Policy Distillation | Jul 21, 2024 | continuous-control, Continuous Control | Code Available | 0 |
| Temporal Abstraction in Reinforcement Learning with Offline Data | Jul 21, 2024 | Hierarchical Reinforcement Learning, MuJoCo | Unverified | 0 |
| LLM-Empowered State Representation for Reinforcement Learning | Jul 18, 2024 | MuJoCo, reinforcement-learning | Code Available | 1 |
| Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning | Jul 17, 2024 | MuJoCo, reinforcement-learning | Code Available | 1 |
| Constrained Intrinsic Motivation for Reinforcement Learning | Jul 12, 2024 | MuJoCo, reinforcement-learning | Code Available | 0 |
| A Review of Nine Physics Engines for Reinforcement Learning Research | Jul 11, 2024 | Decision Making, MuJoCo | Unverified | 0 |
| ROER: Regularized Optimal Experience Replay | Jul 4, 2024 | continuous-control, Continuous Control | Code Available | 0 |
| Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents | Jun 18, 2024 | continuous-control, Continuous Control | Unverified | 0 |
| Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model | Jun 14, 2024 | Board Games, model | Code Available | 0 |
| RRLS : Robust Reinforcement Learning Suite | Jun 12, 2024 | continuous-control, Continuous Control | Code Available | 1 |
| Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning | Jun 12, 2024 | D4RL, MuJoCo | Code Available | 0 |
| Learning Reward and Policy Jointly from Demonstration and Preference Improves Alignment | Jun 11, 2024 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning | Jun 7, 2024 | Contrastive Learning, Meta Reinforcement Learning | Code Available | 1 |
| DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays | Jun 5, 2024 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Value Improved Actor Critic Algorithms | Jun 3, 2024 | MuJoCo | Unverified | 0 |
| Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation | May 31, 2024 | MuJoCo, reinforcement-learning | Code Available | 5 |
| Trust the Model Where It Trusts Itself -- Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption | May 29, 2024 | model, Model-based Reinforcement Learning | Code Available | 0 |
| Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | May 28, 2024 | Imitation Learning, MuJoCo | Code Available | 0 |
| A Pontryagin Perspective on Reinforcement Learning | May 28, 2024 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales | May 27, 2024 | Atari Games, MuJoCo | Code Available | 0 |
| Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning | May 27, 2024 | Gym halfcheetah-medium, Gym halfcheetah-medium-expert | Code Available | 2 |
| Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization | May 25, 2024 | continuous-control, Continuous Control | Code Available | 2 |
| Adaptive Q-Network: On-the-fly Target Selection for Deep Reinforcement Learning | May 25, 2024 | Atari Games, AutoML | Unverified | 0 |
| Diffusion Actor-Critic with Entropy Regulator | May 24, 2024 | Decision Making, MuJoCo | Code Available | 2 |
| Variational Delayed Policy Optimization | May 23, 2024 | MuJoCo, Reinforcement Learning (RL) | Code Available | 0 |
| Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow | May 22, 2024 | Ingenuity, MuJoCo | Code Available | 1 |
| Learning rigid-body simulators over implicit shapes for large-scale scenes and vision | May 22, 2024 | MuJoCo | Unverified | 0 |
| Pure Planning to Pure Policies and In Between with a Recursive Tree Planner | May 21, 2024 | MuJoCo | Unverified | 0 |
| Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | May 20, 2024 | Atari Games, Mamba | Code Available | 0 |
| Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space | May 20, 2024 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Adaptive Exploration for Data-Efficient General Value Function Evaluations | May 13, 2024 | MuJoCo | Code Available | 0 |
| Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline | May 4, 2024 | Computational Efficiency, MuJoCo | Unverified | 0 |
| Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning | May 2, 2024 | Decision Making, MuJoCo | Code Available | 0 |
| Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation | May 2, 2024 | MuJoCo, Reinforcement Learning (RL) | Code Available | 5 |
| S^2AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic | May 2, 2024 | MuJoCo, Variational Inference | Code Available | 1 |
| MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure | May 1, 2024 | Efficient Exploration, MuJoCo | Unverified | 0 |
| Markov flow policy -- deep MC | May 1, 2024 | MuJoCo | Unverified | 0 |
| No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO | May 1, 2024 | MuJoCo, Reinforcement Learning (RL) | Code Available | 1 |
| UCB-driven Utility Function Search for Multi-objective Reinforcement Learning | May 1, 2024 | Decision Making, MuJoCo | Code Available | 1 |
| Closed Loop Interactive Embodied Reasoning for Robot Manipulation | Apr 23, 2024 | MuJoCo, Robot Manipulation | Unverified | 0 |
| Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis | Apr 9, 2024 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer | Apr 8, 2024 | MuJoCo, Physical Simulations | Code Available | 5 |
| DIDA: Denoised Imitation Learning based on Domain Adaptation | Apr 4, 2024 | Domain Adaptation, Imitation Learning | Unverified | 0 |
| Active Learning of Dynamics Using Prior Domain Knowledge in the Sampling Process | Mar 25, 2024 | Active Learning, MuJoCo | Unverified | 0 |
| Robust Model Based Reinforcement Learning Using L_1 Adaptive Control | Mar 21, 2024 | Model-based Reinforcement Learning, MuJoCo | Unverified | 0 |
| A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization | Mar 17, 2024 | MuJoCo | Unverified | 0 |