| Partial advantage estimator for proximal policy optimization | Jan 26, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 1 | 5 |
| FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning | Oct 4, 2020 | GPU, MuJoCo | Code Available | 1 | 5 |
| Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Jun 10, 2019 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 | 5 |
| An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay | Jul 12, 2020 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 | 5 |
| Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments | Jan 20, 2021 | MuJoCo | Code Available | 1 | 5 |
| Improving Sample Efficiency in Model-Free Reinforcement Learning from Images | Oct 2, 2019 | Image Reconstruction, MuJoCo | Code Available | 1 | 5 |
| Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors | Jan 9, 2020 | Continuous Control | Code Available | 1 | 5 |
| Reset-Free Lifelong Learning with Skill-Space Planning | Dec 7, 2020 | Lifelong learning, MuJoCo | Code Available | 1 | 5 |
| An Open-Source Multi-Goal Reinforcement Learning Environment for Robotic Manipulation with Pybullet | May 12, 2021 | MuJoCo, Multi-Goal Reinforcement Learning | Code Available | 1 | 5 |
| Joint action loss for proximal policy optimization | Jan 26, 2023 | Dota 2, MuJoCo | Code Available | 1 | 5 |