| Learning Constraint Network from Demonstrations via Positive-Unlabeled Learning with Memory Replay | Jul 23, 2024 | MuJoCo | Unverified | 0 |
| Proximal Policy Distillation | Jul 21, 2024 | Continuous Control | Code Available | 0 |
| Temporal Abstraction in Reinforcement Learning with Offline Data | Jul 21, 2024 | Hierarchical Reinforcement Learning, MuJoCo | Unverified | 0 |
| Constrained Intrinsic Motivation for Reinforcement Learning | Jul 12, 2024 | MuJoCo, Reinforcement Learning | Code Available | 0 |
| A Review of Nine Physics Engines for Reinforcement Learning Research | Jul 11, 2024 | Decision Making, MuJoCo | Unverified | 0 |
| ROER: Regularized Optimal Experience Replay | Jul 4, 2024 | Continuous Control | Code Available | 0 |
| Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents | Jun 18, 2024 | Continuous Control | Unverified | 0 |
| Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model | Jun 14, 2024 | Board Games, model | Code Available | 0 |
| Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning | Jun 12, 2024 | D4RL, MuJoCo | Code Available | 0 |
| Learning Reward and Policy Jointly from Demonstration and Preference Improves Alignment | Jun 11, 2024 | MuJoCo, Reinforcement Learning | Unverified | 0 |