| FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning | Oct 4, 2020 | GPU, MuJoCo | Code Available | 1 |
| Revisiting Design Choices in Proximal Policy Optimization | Sep 23, 2020 | MuJoCo | Code Available | 1 |
| Sample-Efficient Automated Deep Reinforcement Learning | Sep 3, 2020 | Deep Reinforcement Learning, Hyperparameter Optimization | Code Available | 1 |
| Imitation Learning with Sinkhorn Distances | Aug 20, 2020 | Imitation Learning, MuJoCo | Code Available | 1 |
| Contrastive Variational Reinforcement Learning for Complex Observations | Aug 6, 2020 | Atari Games, Continuous Control | Code Available | 1 |
| Robust Deep Reinforcement Learning through Adversarial Loss | Aug 5, 2020 | Adversarial Attack, Atari Games | Code Available | 1 |
| Nengo and low-power AI hardware for robust, embedded neurorobotics | Jul 20, 2020 | MuJoCo | Code Available | 1 |
| An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay | Jul 12, 2020 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 |
| Fast Adaptation via Policy-Dynamics Value Functions | Jul 6, 2020 | MuJoCo | Code Available | 1 |
| Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient | Jul 3, 2020 | Benchmarking, MuJoCo | Code Available | 1 |
| Learning Invariant Representations for Reinforcement Learning without Reconstruction | Jun 18, 2020 | Causal Inference, MuJoCo | Code Available | 1 |
| Converting Biomechanical Models from OpenSim to MuJoCo | Jun 17, 2020 | MuJoCo, reinforcement-learning | Code Available | 1 |
| MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration | Jun 15, 2020 | Efficient Exploration, Meta Reinforcement Learning | Code Available | 1 |
| Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration | Jun 5, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Delay-Aware Model-Based Reinforcement Learning for Continuous Control | May 11, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization | Apr 29, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| FACMAC: Factored Multi-Agent Centralised Policy Gradients | Mar 14, 2020 | MuJoCo, Multi-agent Reinforcement Learning | Code Available | 1 |
| State-only Imitation with Transition Dynamics Mismatch | Feb 27, 2020 | Imitation Learning, MuJoCo | Code Available | 1 |
| Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors | Jan 9, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning | Oct 18, 2019 | Meta-Learning, MuJoCo | Code Available | 1 |
| Improving Sample Efficiency in Model-Free Reinforcement Learning from Images | Oct 2, 2019 | Image Reconstruction, MuJoCo | Code Available | 1 |
| Self-Supervised Exploration via Disagreement | Jun 10, 2019 | Active Learning, Efficient Exploration | Code Available | 1 |
| Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Jun 10, 2019 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 |
| SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards | May 27, 2019 | Imitation Learning, MuJoCo | Code Available | 1 |
| The StarCraft Multi-Agent Challenge | Feb 11, 2019 | Benchmarking, MuJoCo | Code Available | 1 |
| Simple random search provides a competitive approach to reinforcement learning | Mar 19, 2018 | Computational Efficiency, continuous-control | Code Available | 1 |
| DeepMind Control Suite | Jan 2, 2018 | continuous-control, Continuous Control | Code Available | 1 |
| Learnings Options End-to-End for Continuous Action Tasks | Nov 30, 2017 | MuJoCo | Code Available | 1 |
| Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation | Aug 17, 2017 | Atari Games, continuous-control | Code Available | 1 |
| DART: Noise Injection for Robust Imitation Learning | Mar 27, 2017 | Imitation Learning, MuJoCo | Code Available | 1 |
| Evolution Strategies as a Scalable Alternative to Reinforcement Learning | Mar 10, 2017 | Atari Games, MuJoCo | Code Available | 1 |
| Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback | Jul 17, 2025 | EEG, MuJoCo | Unverified | 0 |
| Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound | Jul 15, 2025 | counterfactual, Decision Making | Unverified | 0 |
| Safe Domain Randomization via Uncertainty-Aware Out-of-Distribution Detection and Policy Adaptation | Jul 8, 2025 | MuJoCo, Out-of-Distribution Detection | Unverified | 0 |
| Detecting and Mitigating Reward Hacking in Reinforcement Learning Systems: A Comprehensive Empirical Study | Jul 8, 2025 | MuJoCo, Recommendation Systems | Unverified | 0 |
| Generalized Adaptive Transfer Network: Enhancing Transfer Learning in Reinforcement Learning Across Domains | Jul 2, 2025 | Atari Games, Chatbot | Code Available | 0 |
| rQdia: Regularizing Q-Value Distributions With Image Augmentation | Jun 26, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration | Jun 25, 2025 | Imitation Learning, MuJoCo | Unverified | 0 |
| ADDQ: Adaptive Distributional Double Q-Learning | Jun 24, 2025 | Distributional Reinforcement Learning, MuJoCo | Code Available | 0 |
| Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning | Jun 24, 2025 | Meta Reinforcement Learning, MuJoCo | Code Available | 0 |
| Hard Contacts with Soft Gradients: Refining Differentiable Simulators for Learning and Control | Jun 17, 2025 | MuJoCo | Unverified | 0 |
| The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning | Jun 16, 2025 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Wasserstein Barycenter Soft Actor-Critic | Jun 11, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| Modular Recurrence in Contextual MDPs for Universal Morphology Control | Jun 10, 2025 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning | Jun 10, 2025 | Data Augmentation, model | Code Available | 0 |
| Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation | Jun 9, 2025 | Decision Making, MuJoCo | Unverified | 0 |
| LLMs for sensory-motor control: Combining in-context and iterative learning | Jun 5, 2025 | MuJoCo | Code Available | 0 |
| Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning | May 29, 2025 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Enhanced DACER Algorithm with High Diffusion Efficiency | May 29, 2025 | Denoising, Imitation Learning | Unverified | 0 |
| ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning | May 29, 2025 | Denoising, MuJoCo | Unverified | 0 |