| ORRB -- OpenAI Remote Rendering Backend | Jun 26, 2019 | MuJoCo | CodeCode Available | 0 |
| Exploring Model-based Planning with Policy Networks | Jun 20, 2019 | Benchmarkingmodel | CodeCode Available | 0 |
| Out-of-Dynamics Imitation Learning from Multimodal Demonstrations | Nov 13, 2022 | Imitation LearningMuJoCo | CodeCode Available | 0 |
| Understanding Adversarial Attacks on Observations in Deep Reinforcement Learning | Jun 30, 2021 | Deep Reinforcement LearningMuJoCo | CodeCode Available | 0 |
| Self-Imitation Learning | Jun 14, 2018 | Atari GamesImitation Learning | CodeCode Available | 0 |
| Self-Imitation Learning for Robot Tasks with Sparse and Delayed Rewards | Oct 14, 2020 | Imitation LearningMuJoCo | CodeCode Available | 0 |
| P3O: Policy-on Policy-off Policy Optimization | May 5, 2019 | MuJoCoReinforcement Learning | CodeCode Available | 0 |
| Parameter-free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients | Sep 24, 2021 | continuous-controlContinuous Control | CodeCode Available | 0 |
| Self Reward Design with Fine-grained Interpretability | Dec 30, 2021 | Deep Reinforcement LearningFairness | CodeCode Available | 0 |
| Explaining RL Decisions with Trajectories | May 6, 2023 | Attributecontinuous-control | CodeCode Available | 0 |