| Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors | Jan 9, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Universal Successor Features for Transfer Reinforcement Learning | Jan 5, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Fast Adaptation to New Environments via Policy-Dynamics Value Functions | Jan 1, 2020 | MuJoCo | Unverified | 0 |
| Inferring DQN structure for high-dimensional continuous control | Jan 1, 2020 | continuous-control, Continuous Control | Unverified | 0 |
| Recruitment-imitation Mechanism for Evolutionary Reinforcement Learning | Dec 13, 2019 | continuous-control, Continuous Control | Unverified | 0 |
| Parareal with a Learned Coarse Model for Robotic Manipulation | Dec 12, 2019 | MuJoCo | Unverified | 0 |
| Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online | Nov 19, 2019 | Continual Learning, continuous-control | Unverified | 0 |
| MANGA: Method Agnostic Neural-policy Generalization and Adaptation | Nov 19, 2019 | Imitation Learning, MuJoCo | Unverified | 0 |
| Gradientless Descent: High-Dimensional Zeroth-Order Optimization | Nov 14, 2019 | MuJoCo, Vocal Bursts Intensity Prediction | Unverified | 0 |
| Multi-Path Policy Optimization | Nov 11, 2019 | Deep Reinforcement Learning, Efficient Exploration | Unverified | 0 |
| Asynchronous Methods for Model-Based Reinforcement Learning | Oct 28, 2019 | model, Model-based Reinforcement Learning | Code Available | 0 |
| BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning | Oct 27, 2019 | Deep Reinforcement Learning, Imitation Learning | Code Available | 0 |
| Unifying Variational Inference and PAC-Bayes for Supervised Learning that Scales | Oct 23, 2019 | MuJoCo, Variational Inference | Code Available | 0 |
| VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning | Oct 18, 2019 | Meta-Learning, MuJoCo | Code Available | 1 |
| On the Expressivity of Neural Networks for Deep Reinforcement Learning | Oct 14, 2019 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards | Oct 10, 2019 | Hierarchical Reinforcement Learning, MuJoCo | Code Available | 0 |
| Multi-step Greedy Reinforcement Learning Algorithms | Oct 7, 2019 | Continuous Control, Game of Go | Unverified | 0 |
| Learning Calibratable Policies using Programmatic Style-Consistency | Oct 2, 2019 | Imitation Learning, MuJoCo | Code Available | 0 |
| Formal Language Constraints for Markov Decision Processes | Oct 2, 2019 | Atari Games, MuJoCo | Code Available | 0 |
| Improving Sample Efficiency in Model-Free Reinforcement Learning from Images | Oct 2, 2019 | Image Reconstruction, MuJoCo | Code Available | 1 |
| Learning from Observations Using a Single Video Demonstration and Human Feedback | Sep 29, 2019 | MuJoCo | Unverified | 0 |
| A Generalized Training Approach for Multiagent Learning | Sep 27, 2019 | MuJoCo | Unverified | 0 |
| Relationship Explainable Multi-objective Reinforcement Learning with Semantic Explainability Generation | Sep 26, 2019 | MuJoCo, Multi-Objective Reinforcement Learning | Unverified | 0 |
| Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning | Sep 25, 2019 | Decision Making, Knowledge Distillation | Unverified | 0 |
| Deep exploration by novelty-pursuit with maximum state entropy | Sep 25, 2019 | Efficient Exploration, MuJoCo | Unverified | 0 |
| Regulatory Focus: Promotion and Prevention Inclinations in Policy Search | Sep 25, 2019 | Atari Games, continuous-control | Unverified | 0 |
| Risk Averse Value Expansion for Sample Efficient and Robust Policy Learning | Sep 25, 2019 | Model-based Reinforcement Learning, MuJoCo | Unverified | 0 |
| CrossNorm: On Normalization for Off-Policy Reinforcement Learning | Sep 25, 2019 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Stabilizing Off-Policy Reinforcement Learning with Conservative Policy Gradients | Sep 25, 2019 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Bootstrapping the Expressivity with Model-based Planning | Sep 25, 2019 | model, MuJoCo | Code Available | 0 |
| Policy Tree Network | Sep 25, 2019 | Model-based Reinforcement Learning, MuJoCo | Unverified | 0 |
| Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning | Sep 25, 2019 | continuous-control, Continuous Control | Unverified | 0 |
| Learning Latent Representations for Inverse Dynamics using Generalized Experiences | Sep 25, 2019 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Safe Policy Learning for Continuous Control | Sep 25, 2019 | continuous-control, Continuous Control | Unverified | 0 |
| Multi-task Batch Reinforcement Learning with Metric Learning | Sep 25, 2019 | Meta Reinforcement Learning, Metric Learning | Unverified | 0 |
| MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning | Sep 17, 2019 | MuJoCo, OpenAI Gym | Code Available | 0 |
| Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning | Sep 17, 2019 | continuous-control, Continuous Control | Unverified | 0 |
| Biased Estimates of Advantages over Path Ensembles | Sep 15, 2019 | Atari Games, continuous-control | Unverified | 0 |
| Policy Prediction Network: Model-Free Behavior Policy with Model-Based Learning in Continuous Action Space | Sep 15, 2019 | continuous-control, Continuous Control | Unverified | 0 |
| Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning | Sep 11, 2019 | MuJoCo, Q-Learning | Unverified | 0 |
| Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning | Sep 7, 2019 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity | Aug 14, 2019 | Decoder, Deep Reinforcement Learning | Unverified | 0 |
| Towards Model-based Reinforcement Learning for Industry-near Environments | Jul 27, 2019 | Deep Reinforcement Learning, Model-based Reinforcement Learning | Code Available | 0 |
| A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment | Jul 26, 2019 | MuJoCo, Reinforcement Learning | Unverified | 0 |
| Learning Policies through Quantile Regression | Jun 27, 2019 | MuJoCo, quantile regression | Unverified | 0 |
| ORRB -- OpenAI Remote Rendering Backend | Jun 26, 2019 | MuJoCo | Code Available | 0 |
| Exploring Model-based Planning with Policy Networks | Jun 20, 2019 | Benchmarking, model | Code Available | 0 |
| Calibrated Model-Based Deep Reinforcement Learning | Jun 19, 2019 | Deep Reinforcement Learning, model | Code Available | 0 |
| Reward Prediction Error as an Exploration Objective in Deep RL | Jun 19, 2019 | Atari Games, Continuous Control | Unverified | 0 |
| Robust Reinforcement Learning for Continuous Control with Model Misspecification | Jun 18, 2019 | continuous-control, Continuous Control | Unverified | 0 |