| Q-learning with Language Model for Edit-based Unsupervised Summarization | Oct 9, 2020 | Abstractive Text Summarization, Decoder | Code Available | 1 |
| EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Oct 9, 2020 | Deep Reinforcement Learning, Epidemiology | Code Available | 1 |
| Energy-based Surprise Minimization for Multi-Agent Value Factorization | Sep 16, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Deep Active Inference for Partially Observable MDPs | Sep 8, 2020 | Deep Reinforcement Learning, Q-Learning | Code Available | 1 |
| Table2Charts: Recommending Charts by Learning Shared Table Representations | Aug 24, 2020 | Q-Learning, Recommendation Systems | Code Available | 1 |
| Robust Deep Reinforcement Learning through Adversarial Loss | Aug 5, 2020 | Adversarial Attack, Atari Games | Code Available | 1 |
| Deep Inverse Q-learning with Constraints | Aug 4, 2020 | Q-Learning | Code Available | 1 |
| QPLEX: Duplex Dueling Multi-Agent Q-Learning | Aug 3, 2020 | Decision Making, Multi-agent Reinforcement Learning | Code Available | 1 |
| SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning | Jul 9, 2020 | Deep Reinforcement Learning, Diversity | Code Available | 1 |
| Neural Interactive Collaborative Filtering | Jul 4, 2020 | Collaborative Filtering, Meta-Learning | Code Available | 1 |
| Reward Machines for Cooperative Multi-Agent Reinforcement Learning | Jul 3, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Gradient Temporal-Difference Learning with Regularized Corrections | Jul 1, 2020 | Q-Learning | Code Available | 1 |
| Image Classification by Reinforcement Learning with Two-State Q-Learning | Jun 28, 2020 | Classification, General Classification | Code Available | 1 |
| Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning | Jun 18, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Semantic Visual Navigation by Watching YouTube Videos | Jun 17, 2020 | Q-Learning, Visual Navigation | Code Available | 1 |
| Conservative Q-Learning for Offline Reinforcement Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Multi-Agent Determinantal Q-Learning | Jun 2, 2020 | Q-Learning | Code Available | 1 |
| Modeling Penetration Testing with Reinforcement Learning Using Capture-the-Flag Challenges: Trade-offs between Model-free Learning and A Priori Knowledge | May 26, 2020 | Q-Learning, reinforcement-learning | Code Available | 1 |
| Spatial Action Maps for Mobile Manipulation | Apr 20, 2020 | Q-Learning, Value prediction | Code Available | 1 |
| Using Deep Reinforcement Learning Methods for Autonomous Vessels in 2D Environments | Mar 23, 2020 | Decision Making, Deep Reinforcement Learning | Code Available | 1 |
| FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques | Mar 21, 2020 | Q-Learning, reinforcement-learning | Code Available | 1 |
| DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Mar 16, 2020 | Deep Reinforcement Learning, Meta-Learning | Code Available | 1 |
| FACMAC: Factored Multi-Agent Centralised Policy Gradients | Mar 14, 2020 | MuJoCo, Multi-agent Reinforcement Learning | Code Available | 1 |
| Optimistic Exploration even with a Pessimistic Initialisation | Feb 26, 2020 | Efficient Exploration, Q-Learning | Code Available | 1 |
| Maxmin Q-learning: Controlling the Estimation Bias of Q-learning | Feb 16, 2020 | Q-Learning | Code Available | 1 |
| A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks | Feb 6, 2020 | energy management, energy trading | Code Available | 1 |
| Discriminator Soft Actor Critic without Extrinsic Rewards | Jan 19, 2020 | Imitation Learning, Q-Learning | Code Available | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Jan 1, 2020 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| Benchmarking Batch Deep Reinforcement Learning Algorithms | Oct 3, 2019 | Benchmarking, Deep Reinforcement Learning | Code Available | 1 |
| Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Sep 26, 2019 | Feature Engineering, Q-Learning | Code Available | 1 |
| ModelicaGym: Applying Reinforcement Learning to Modelica Models | Sep 18, 2019 | Q-Learning, reinforcement-learning | Code Available | 1 |
| An Optimistic Perspective on Offline Reinforcement Learning | Jul 10, 2019 | Atari Games, Diversity | Code Available | 1 |
| A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry | Jun 21, 2019 | Decision Making, Lifelong learning | Code Available | 1 |
| Split Q Learning: Reinforcement Learning with Two-Stream Rewards | Jun 21, 2019 | Decision Making, Q-Learning | Code Available | 1 |
| Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Jun 10, 2019 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 |
| SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards | May 27, 2019 | Imitation Learning, MuJoCo | Code Available | 1 |
| Optimization of Molecules via Deep Reinforcement Learning | Oct 19, 2018 | Deep Reinforcement Learning, Molecular Graph Generation | Code Available | 1 |
| Negative Update Intervals in Deep Multi-Agent Reinforcement Learning | Sep 13, 2018 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Is Q-learning Provably Efficient? | Jul 10, 2018 | Q-Learning, Reinforcement Learning | Code Available | 1 |
| Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning | Mar 27, 2018 | Deep Reinforcement Learning, Q-Learning | Code Available | 1 |
| Addressing Function Approximation Error in Actor-Critic Methods | Feb 26, 2018 | Continuous Control, OpenAI Gym | Code Available | 1 |
| Mean Field Multi-Agent Reinforcement Learning | Feb 15, 2018 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor | Jan 4, 2018 | Continuous Control, Decision Making | Code Available | 1 |
| Automated Cloud Provisioning on AWS using Deep Reinforcement Learning | Sep 13, 2017 | Cloud Computing, Deep Reinforcement Learning | Code Available | 1 |
| Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments | Jun 7, 2017 | Deep Reinforcement Learning, Multi-agent Reinforcement Learning | Code Available | 1 |
| Evolution Strategies as a Scalable Alternative to Reinforcement Learning | Mar 10, 2017 | Atari Games, MuJoCo | Code Available | 1 |
| Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning | Feb 28, 2017 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Continuous Deep Q-Learning with Model-based Acceleration | Mar 2, 2016 | continuous-control, Continuous Control | Code Available | 1 |
| Multiagent Cooperation and Competition with Deep Reinforcement Learning | Nov 27, 2015 | Deep Reinforcement Learning, Q-Learning | Code Available | 1 |
| Deep Reinforcement Learning with Double Q-learning | Sep 22, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |