| Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments | Jun 7, 2017 | Deep Reinforcement Learning, Multi-agent Reinforcement Learning | Code Available | 1 |
| Multi-Agent Collaboration via Reward Attribution Decomposition | Oct 16, 2020 | Dota 2, Multi-agent Reinforcement Learning | Code Available | 1 |
| Multi-Agent Determinantal Q-Learning | Jun 2, 2020 | Q-Learning | Code Available | 1 |
| Multi-Agent Reinforcement Learning via Distributed MPC as a Function Approximator | Dec 8, 2023 | Model Predictive Control, Multi-agent Reinforcement Learning | Code Available | 1 |
| Deep Reinforcement Learning with Double Q-learning | Sep 22, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| Offline Reinforcement Learning with Implicit Q-Learning | Oct 12, 2021 | D4RL, Offline RL | Code Available | 1 |
| Continuous Deep Q-Learning with Model-based Acceleration | Mar 2, 2016 | continuous-control, Continuous Control | Code Available | 1 |
| Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization | Mar 28, 2023 | D4RL, Offline RL | Code Available | 1 |
| Optimistic Exploration even with a Pessimistic Initialisation | Feb 26, 2020 | Efficient Exploration, Q-Learning | Code Available | 1 |
| Optimistic Multi-Agent Policy Gradient | Nov 3, 2023 | MuJoCo, Q-Learning | Code Available | 1 |
| PGDQN: Preference-Guided Deep Q-Network | Oct 3, 2023 | Atari Games, Benchmarking | Code Available | 1 |
| PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer | Jun 10, 2024 | continuous-control, Continuous Control | Code Available | 1 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RL, Imitation Learning | Code Available | 1 |
| Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Sep 22, 2023 | counterfactual, Multi-agent Reinforcement Learning | Code Available | 1 |
| Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection | Sep 29, 2021 | Q-Learning, Traffic Signal Control | Code Available | 1 |
| Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Mar 9, 2023 | Offline RL, Q-Learning | Code Available | 1 |
| Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Jun 10, 2019 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 |
| Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Dec 1, 2020 | Feature Engineering, Q-Learning | Code Available | 1 |
| Addressing Function Approximation Error in Actor-Critic Methods | Feb 26, 2018 | Continuous Control, OpenAI Gym | Code Available | 1 |
| Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19 | Feb 9, 2021 | Benchmarking, Q-Learning | Code Available | 1 |
| Boosting Continuous Control with Consistency Policy | Oct 10, 2023 | continuous-control, Continuous Control | Code Available | 1 |
| Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem | Dec 8, 2020 | Combinatorial Optimization, Q-Learning | Code Available | 1 |
| Conservative Q-Learning for Offline Reinforcement Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Continuous control with deep reinforcement learning | Sep 9, 2015 | Action Detection, continuous-control | Code Available | 1 |
| Deep Active Inference for Partially Observable MDPs | Sep 8, 2020 | Deep Reinforcement Learning, Q-Learning | Code Available | 1 |
| Deep Inverse Q-learning with Constraints | Aug 4, 2020 | Q-Learning | Code Available | 1 |
| Acting in Delayed Environments with Non-Stationary Markov Policies | Jan 28, 2021 | Cloud Computing, Q-Learning | Code Available | 1 |
| Deep Recurrent Q-Learning for Partially Observable MDPs | Jul 23, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning | May 2, 2022 | Data Augmentation, Q-Learning | Code Available | 1 |
| A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games | Jul 18, 2022 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Jul 10, 2021 | Q-Learning, reinforcement-learning | Code Available | 1 |
| DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Mar 16, 2020 | Deep Reinforcement Learning, Meta-Learning | Code Available | 1 |
| Automated Cloud Provisioning on AWS using Deep Reinforcement Learning | Sep 13, 2017 | Cloud Computing, Deep Reinforcement Learning | Code Available | 1 |
| Dropout Q-Functions for Doubly Efficient Reinforcement Learning | Oct 5, 2021 | Computational Efficiency, Q-Learning | Code Available | 1 |
| Energy-based Surprise Minimization for Multi-Agent Value Factorization | Sep 16, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Jun 7, 2021 | Multi-agent Reinforcement Learning, Offline RL | Code Available | 1 |
| A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | May 27, 2024 | Data Augmentation, Q-Learning | Code Available | 1 |
| FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques | Mar 21, 2020 | Q-Learning, reinforcement-learning | Code Available | 1 |
| Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Oct 30, 2023 | Decision Making, Offline RL | Code Available | 1 |
| GAIL-PT: A Generic Intelligent Penetration Testing Framework with Generative Adversarial Imitation Learning | Apr 5, 2022 | Imitation Learning, Q-Learning | Code Available | 1 |
| HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation | May 4, 2021 | Bayesian Optimization, Q-Learning | Code Available | 1 |
| Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient | Oct 13, 2022 | Montezuma's Revenge, Q-Learning | Code Available | 1 |
| Image Classification by Reinforcement Learning with Two-State Q-Learning | Jun 28, 2020 | Classification, General Classification | Code Available | 1 |
| Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Sep 26, 2019 | Feature Engineering, Q-Learning | Code Available | 1 |
| Learning the Markov Decision Process in the Sparse Gaussian Elimination | Sep 30, 2021 | Combinatorial Optimization, Q-Learning | Code Available | 1 |
| LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning | Mar 1, 2023 | Continuous Control, Imitation Learning | Code Available | 1 |
| MAN: Multi-Action Networks Learning | Sep 19, 2022 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer | Jun 20, 2022 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Jan 1, 2020 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks | Feb 6, 2020 | energy management, energy trading | Code Available | 1 |