| First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation | Dec 6, 2022 | continuous-controlContinuous Control | —Unverified | 0 |
| DEFENDER: DTW-Based Episode Filtering Using Demonstrations for Enhancing RL Safety | May 8, 2023 | MuJoCo | —Unverified | 0 |
| Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals | Aug 5, 2020 | Deep Reinforcement LearningMuJoCo | —Unverified | 0 |
| Bayesian Distributional Policy Gradients | Mar 20, 2021 | Atari GamesContrastive Learning | —Unverified | 0 |
| Formal Language Constrained Markov Decision Processes | Jan 1, 2021 | MuJoCo | —Unverified | 0 |
| CLARE: Conservative Model-Based Reward Learning for Offline Inverse Reinforcement Learning | Feb 9, 2023 | continuous-controlContinuous Control | —Unverified | 0 |
| FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility | Oct 8, 2023 | MuJoCoMulti-agent Reinforcement Learning | —Unverified | 0 |
| From proprioception to long-horizon planning in novel environments: A hierarchical RL model | Jun 11, 2020 | Efficient ExplorationModel Predictive Control | —Unverified | 0 |
| Gaussian Process Policy Optimization | Mar 2, 2020 | MuJoCoreinforcement-learning | —Unverified | 0 |
| Hamiltonian Policy Optimization | Feb 28, 2021 | continuous-controlContinuous Control | —Unverified | 0 |