| Fine-Tuning Offline Reinforcement Learning with Model-Based Policy Optimization | Jan 1, 2021 | D4RL, MuJoCo | Unverified | 0 |
| Practical Marginalized Importance Sampling with the Successor Representation | Jan 1, 2021 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| CAT-SAC: Soft Actor-Critic with Curiosity-Aware Entropy Temperature | Jan 1, 2021 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Adaptive N-step Bootstrapping with Off-policy Data | Jan 1, 2021 | Atari Games, MuJoCo | Unverified | 0 |
| Intrinsically Guided Exploration in Meta Reinforcement Learning | Jan 1, 2021 | Deep Reinforcement Learning, Efficient Exploration | Unverified | 0 |
| Invariant Representations for Reinforcement Learning without Reconstruction | Jan 1, 2021 | Causal Inference, MuJoCo | Unverified | 0 |
| TEAC: Intergrating Trust Region and Max Entropy Actor Critic for Continuous Control | Jan 1, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| PGPS : Coupling Policy Gradient with Population-based Search | Jan 1, 2021 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards | Dec 26, 2020 | continuous-control, Continuous Control | Code Available | 0 |
| OPAC: Opportunistic Actor-Critic | Dec 11, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Offline Imitation Learning with a Misspecified Simulator | Dec 1, 2020 | Decision Making, Friction | Unverified | 0 |
| Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp | Nov 30, 2020 | continuous-control, Continuous Control | Code Available | 0 |
| Weighted Entropy Modification for Soft Actor-Critic | Nov 18, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Proximal Policy Optimization via Enhanced Exploration Efficiency | Nov 11, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Sim2Sim Evaluation of a Novel Data-Efficient Differentiable Physics Engine for Tensegrity Robots | Nov 10, 2020 | MuJoCo | Unverified | 0 |
| Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping | Nov 5, 2020 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Cooperative Heterogeneous Deep Reinforcement Learning | Nov 2, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Can Reinforcement Learning for Continuous Control Generalize Across Physics Engines? | Oct 27, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification | Oct 20, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach | Oct 15, 2020 | Generative Adversarial Network, MuJoCo | Code Available | 0 |
| Self-Imitation Learning for Robot Tasks with Sparse and Delayed Rewards | Oct 14, 2020 | Imitation Learning, MuJoCo | Code Available | 0 |
| Balancing Constraints and Rewards with Meta-Gradient D4PG | Oct 13, 2020 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Hindsight Experience Replay with Kronecker Product Approximate Curvature | Oct 9, 2020 | MuJoCo | Unverified | 0 |
| Learning Intrinsic Symbolic Rewards in Reinforcement Learning | Oct 8, 2020 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator | Sep 28, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Population-Guided Imitation Learning | Sep 27, 2020 | Atari Games, Imitation Learning | Unverified | 0 |
| Soft policy optimization using dual-track advantage estimator | Sep 15, 2020 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Constrained Markov Decision Processes via Backward Value Functions | Aug 26, 2020 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Adversarial Imitation Learning via Random Search | Aug 21, 2020 | Computational Efficiency, Deep Reinforcement Learning | Unverified | 0 |
| Forward and inverse reinforcement learning sharing network weights and hyperparameters | Aug 17, 2020 | Imitation Learning, MuJoCo | Unverified | 0 |
| Overcoming Model Bias for Robust Offline Deep Reinforcement Learning | Aug 12, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals | Aug 5, 2020 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Weak Human Preference Supervision For Deep Reinforcement Learning | Jul 25, 2020 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Learning to Play Cup-and-Ball with Noisy Camera Observations | Jul 19, 2020 | MuJoCo | Code Available | 0 |
| CoNES: Convex Natural Evolutionary Strategies | Jul 16, 2020 | Benchmarking, MuJoCo | Unverified | 0 |
| Inverse Reinforcement Learning from a Gradient-based Learner | Jul 15, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Regularly Updated Deterministic Policy Gradient Algorithm | Jul 1, 2020 | MuJoCo, Q-Learning | Unverified | 0 |
| DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning | Jun 26, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| SOAC: The Soft Option Actor-Critic Architecture | Jun 25, 2020 | MuJoCo, Transfer Learning | Unverified | 0 |
| ELSIM: End-to-end learning of reusable skills through intrinsic motivation | Jun 23, 2020 | Developmental Learning, MuJoCo | Unverified | 0 |
| dm_control: Software and Tasks for Continuous Control | Jun 22, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Non-local Policy Optimization via Diversity-regularized Collaborative Exploration | Jun 14, 2020 | Diversity, MuJoCo | Unverified | 0 |
| Continuous Control for Searching and Planning with a Learned Model | Jun 12, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Decorrelated Double Q-learning | Jun 12, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| From proprioception to long-horizon planning in novel environments: A hierarchical RL model | Jun 11, 2020 | Efficient Exploration, Model Predictive Control | Unverified | 0 |
| Primal Wasserstein Imitation Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 0 |
| Cross-Domain Imitation Learning with a Dual Structure | Jun 2, 2020 | Imitation Learning, MuJoCo | Unverified | 0 |
| Gradient Monitored Reinforcement Learning | May 25, 2020 | Atari Games, continuous-control | Unverified | 0 |
| Novel Policy Seeking with Constrained Optimization | May 21, 2020 | Diversity, MuJoCo | Code Available | 0 |
| Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning | May 14, 2020 | Adversarial Attack, Deep Reinforcement Learning | Unverified | 0 |