| CoNES: Convex Natural Evolutionary Strategies | Jul 16, 2020 | Benchmarking, MuJoCo | Unverified | 0 |
| Inverse Reinforcement Learning from a Gradient-based Learner | Jul 15, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 |
| An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay | Jul 12, 2020 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 |
| Fast Adaptation via Policy-Dynamics Value Functions | Jul 6, 2020 | MuJoCo | Code Available | 1 |
| Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient | Jul 3, 2020 | Benchmarking, MuJoCo | Code Available | 1 |
| Regularly Updated Deterministic Policy Gradient Algorithm | Jul 1, 2020 | MuJoCo, Q-Learning | Unverified | 0 |
| DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning | Jun 26, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| SOAC: The Soft Option Actor-Critic Architecture | Jun 25, 2020 | MuJoCo, Transfer Learning | Unverified | 0 |
| ELSIM: End-to-end learning of reusable skills through intrinsic motivation | Jun 23, 2020 | Developmental Learning, MuJoCo | Unverified | 0 |
| dm_control: Software and Tasks for Continuous Control | Jun 22, 2020 | continuous-control, Continuous Control | Unverified | 0 |