| Self-Imitation Learning for Robot Tasks with Sparse and Delayed Rewards | Oct 14, 2020 | Imitation Learning, MuJoCo | Code Available | 0 |
| Balancing Constraints and Rewards with Meta-Gradient D4PG | Oct 13, 2020 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Hindsight Experience Replay with Kronecker Product Approximate Curvature | Oct 9, 2020 | MuJoCo | Unverified | 0 |
| Learning Intrinsic Symbolic Rewards in Reinforcement Learning | Oct 8, 2020 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator | Sep 28, 2020 | Continuous Control | Unverified | 0 |
| Population-Guided Imitation Learning | Sep 27, 2020 | Atari Games, Imitation Learning | Unverified | 0 |
| Soft policy optimization using dual-track advantage estimator | Sep 15, 2020 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Constrained Markov Decision Processes via Backward Value Functions | Aug 26, 2020 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Adversarial Imitation Learning via Random Search | Aug 21, 2020 | Computational Efficiency, Deep Reinforcement Learning | Unverified | 0 |
| Forward and inverse reinforcement learning sharing network weights and hyperparameters | Aug 17, 2020 | Imitation Learning, MuJoCo | Unverified | 0 |