| Adversarial Imitation Learning via Random Search | Aug 21, 2020 | Computational Efficiency, Deep Reinforcement Learning | Unverified | 0 |
| Imitation Learning with Sinkhorn Distances | Aug 20, 2020 | Imitation Learning, MuJoCo | Code Available | 1 |
| Forward and inverse reinforcement learning sharing network weights and hyperparameters | Aug 17, 2020 | Imitation Learning, MuJoCo | Unverified | 0 |
| Overcoming Model Bias for Robust Offline Deep Reinforcement Learning | Aug 12, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Contrastive Variational Reinforcement Learning for Complex Observations | Aug 6, 2020 | Atari Games, Continuous Control | Code Available | 1 |
| Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals | Aug 5, 2020 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Robust Deep Reinforcement Learning through Adversarial Loss | Aug 5, 2020 | Adversarial Attack, Atari Games | Code Available | 1 |
| Weak Human Preference Supervision For Deep Reinforcement Learning | Jul 25, 2020 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Nengo and low-power AI hardware for robust, embedded neurorobotics | Jul 20, 2020 | MuJoCo | Code Available | 1 |
| Learning to Play Cup-and-Ball with Noisy Camera Observations | Jul 19, 2020 | MuJoCo | Code Available | 0 |
| CoNES: Convex Natural Evolutionary Strategies | Jul 16, 2020 | Benchmarking, MuJoCo | Unverified | 0 |
| Inverse Reinforcement Learning from a Gradient-based Learner | Jul 15, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 |
| An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay | Jul 12, 2020 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 |
| Fast Adaptation via Policy-Dynamics Value Functions | Jul 6, 2020 | MuJoCo | Code Available | 1 |
| Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient | Jul 3, 2020 | Benchmarking, MuJoCo | Code Available | 1 |
| Regularly Updated Deterministic Policy Gradient Algorithm | Jul 1, 2020 | MuJoCo, Q-Learning | Unverified | 0 |
| DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning | Jun 26, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| SOAC: The Soft Option Actor-Critic Architecture | Jun 25, 2020 | MuJoCo, Transfer Learning | Unverified | 0 |
| ELSIM: End-to-end learning of reusable skills through intrinsic motivation | Jun 23, 2020 | Developmental Learning, MuJoCo | Unverified | 0 |
| dm_control: Software and Tasks for Continuous Control | Jun 22, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Learning Invariant Representations for Reinforcement Learning without Reconstruction | Jun 18, 2020 | Causal Inference, MuJoCo | Code Available | 1 |
| Converting Biomechanical Models from OpenSim to MuJoCo | Jun 17, 2020 | MuJoCo, reinforcement-learning | Code Available | 1 |
| MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration | Jun 15, 2020 | Efficient Exploration, Meta Reinforcement Learning | Code Available | 1 |
| Non-local Policy Optimization via Diversity-regularized Collaborative Exploration | Jun 14, 2020 | Diversity, MuJoCo | Unverified | 0 |
| Continuous Control for Searching and Planning with a Learned Model | Jun 12, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Decorrelated Double Q-learning | Jun 12, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| From proprioception to long-horizon planning in novel environments: A hierarchical RL model | Jun 11, 2020 | Efficient Exploration, Model Predictive Control | Unverified | 0 |
| Primal Wasserstein Imitation Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 0 |
| Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration | Jun 5, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Cross-Domain Imitation Learning with a Dual Structure | Jun 2, 2020 | Imitation Learning, MuJoCo | Unverified | 0 |
| Gradient Monitored Reinforcement Learning | May 25, 2020 | Atari Games, continuous-control | Unverified | 0 |
| Novel Policy Seeking with Constrained Optimization | May 21, 2020 | Diversity, MuJoCo | Code Available | 0 |
| Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning | May 14, 2020 | Adversarial Attack, Deep Reinforcement Learning | Unverified | 0 |
| Delay-Aware Model-Based Reinforcement Learning for Continuous Control | May 11, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control | May 1, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization | Apr 29, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Evolutionary Stochastic Policy Distillation | Apr 27, 2020 | MuJoCo, Reinforcement Learning | Code Available | 0 |
| Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning | Apr 22, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 |
| FACMAC: Factored Multi-Agent Centralised Policy Gradients | Mar 14, 2020 | MuJoCo, Multi-agent Reinforcement Learning | Code Available | 1 |
| Relevance-Guided Modeling of Object Dynamics for Reinforcement Learning | Mar 3, 2020 | Atari Games, Deep Reinforcement Learning | Unverified | 0 |
| Gaussian Process Policy Optimization | Mar 2, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 |
| State-only Imitation with Transition Dynamics Mismatch | Feb 27, 2020 | Imitation Learning, MuJoCo | Code Available | 1 |
| Robust Reinforcement Learning via Adversarial training with Langevin Dynamics | Feb 14, 2020 | MuJoCo, reinforcement-learning | Code Available | 0 |
| Generalized Hidden Parameter MDPs Transferable Model-based RL in a Handful of Trials | Feb 8, 2020 | MuJoCo | Unverified | 0 |
| Multi-task Reinforcement Learning with a Planning Quasi-Metric | Feb 8, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Temporal-adaptive Hierarchical Reinforcement Learning | Feb 6, 2020 | Atari Games, Hierarchical Reinforcement Learning | Unverified | 0 |
| Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning | Feb 1, 2020 | Knowledge Distillation, MuJoCo | Code Available | 0 |
| Lyceum: An efficient and scalable ecosystem for robot learning | Jan 21, 2020 | Model Predictive Control, MuJoCo | Unverified | 0 |
| Effects of sparse rewards of different magnitudes in the speed of learning of model-based actor critic methods | Jan 18, 2020 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| SEERL: Sample Efficient Ensemble Reinforcement Learning | Jan 15, 2020 | continuous-control, Continuous Control | Unverified | 0 |