| Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Oct 30, 2023 | Decision Making, Offline RL | Code Available | 1 | 5 |
| Research on Robot Path Planning Based on Reinforcement Learning | Apr 22, 2024 | Q-Learning, reinforcement-learning | Code Available | 1 | 5 |
| Reward Machines for Cooperative Multi-Agent Reinforcement Learning | Jul 3, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| Extreme Q-Learning: MaxEnt RL without Entropy | Jan 5, 2023 | D4RL, Deep Reinforcement Learning | Code Available | 1 | 5 |
| GAIL-PT: A Generic Intelligent Penetration Testing Framework with Generative Adversarial Imitation Learning | Apr 5, 2022 | Imitation Learning, Q-Learning | Code Available | 1 | 5 |
| Robust Q-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty | Sep 30, 2022 | Q-Learning | Code Available | 1 | 5 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Jan 1, 2020 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| A Search-Based Testing Approach for Deep Reinforcement Learning Agents | Jun 15, 2022 | Autonomous Driving, Decision Making | Code Available | 1 | 5 |
| Deep Recurrent Q-Learning for Partially Observable MDPs | Jul 23, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| SHAQ: Incorporating Shapley Value Theory into Multi-Agent Q-Learning | May 31, 2021 | Fairness, Multi-agent Reinforcement Learning | Code Available | 1 | 5 |
| Adaptive Contention Window Design using Deep Q-learning | Nov 18, 2020 | Q-Learning, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| Deep Reinforcement Learning with Double Q-learning | Sep 22, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Spatial Action Maps for Mobile Manipulation | Apr 20, 2020 | Q-Learning, Value prediction | Code Available | 1 | 5 |
| Split Q Learning: Reinforcement Learning with Two-Stream Rewards | Jun 21, 2019 | Decision Making, Q-Learning | Code Available | 1 | 5 |
| Dropout Q-Functions for Doubly Efficient Reinforcement Learning | Oct 5, 2021 | Computational Efficiency, Q-Learning | Code Available | 1 | 5 |
| Strategically Conservative Q-Learning | Jun 6, 2024 | D4RL, Offline RL | Code Available | 1 | 5 |
| An Optimistic Perspective on Offline Reinforcement Learning | Jul 10, 2019 | Atari Games, Diversity | Code Available | 1 | 5 |
| SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning | Jul 9, 2020 | Deep Reinforcement Learning, Diversity | Code Available | 1 | 5 |
| Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection | Sep 29, 2021 | Q-Learning, Traffic Signal Control | Code Available | 1 | 5 |
| DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning | Feb 16, 2021 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| Efficient (Soft) Q-Learning for Text Generation with Limited Good Data | Jun 14, 2021 | Q-Learning, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error | Feb 3, 2024 | Adversarial Robustness, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | May 30, 2024 | D4RL, Denoising | Code Available | 1 | 5 |
| CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning | May 2, 2022 | Data Augmentation, Q-Learning | Code Available | 1 | 5 |
| EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Oct 9, 2020 | Deep Reinforcement Learning, Epidemiology | Code Available | 1 | 5 |
| DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Mar 16, 2020 | Deep Reinforcement Learning, Meta-Learning | Code Available | 1 | 5 |
| Gradient Temporal-Difference Learning with Regularized Corrections | Jul 1, 2020 | Q-Learning | Code Available | 1 | 5 |
| Learning the Markov Decision Process in the Sparse Gaussian Elimination | Sep 30, 2021 | Combinatorial Optimization, Q-Learning | Code Available | 1 | 5 |
| Multi-Agent Determinantal Q-Learning | Jun 2, 2020 | Q-Learning | Code Available | 1 | 5 |
| Energy-based Surprise Minimization for Multi-Agent Value Factorization | Sep 16, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | May 27, 2024 | Data Augmentation, Q-Learning | Code Available | 1 | 5 |
| Evolution Strategies as a Scalable Alternative to Reinforcement Learning | Mar 10, 2017 | Atari Games, MuJoCo | Code Available | 1 | 5 |
| FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques | Mar 21, 2020 | Q-Learning, reinforcement-learning | Code Available | 1 | 5 |
| Randomized Ensembled Double Q-Learning: Learning Fast Without a Model | Jan 15, 2021 | MuJoCo, Q-Learning | Code Available | 1 | 5 |
| Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls | Oct 27, 2020 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks | Feb 6, 2020 | energy management, energy trading | Code Available | 1 | 5 |
| Addressing Function Approximation Error in Actor-Critic Methods | Feb 26, 2018 | Continuous Control, OpenAI Gym | Code Available | 1 | 5 |
| Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient | Oct 13, 2022 | Montezuma's Revenge, Q-Learning | Code Available | 1 | 5 |
| Automated Cloud Provisioning on AWS using Deep Reinforcement Learning | Sep 13, 2017 | Cloud Computing, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Jul 10, 2021 | Q-Learning, reinforcement-learning | Code Available | 1 | 5 |
| IQ-Learn: Inverse soft-Q Learning for Imitation | Jun 23, 2021 | Atari Games, Continuous Control | Code Available | 1 | 5 |
| Is Q-learning Provably Efficient? | Jul 10, 2018 | Q-Learning, Reinforcement Learning | Code Available | 1 | 5 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RL, Imitation Learning | Code Available | 1 | 5 |
| Benchmarking Batch Deep Reinforcement Learning Algorithms | Oct 3, 2019 | Benchmarking, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Jun 7, 2021 | Multi-agent Reinforcement Learning, Offline RL | Code Available | 1 | 5 |
| MADiff: Offline Multi-agent Learning with Diffusion Models | May 27, 2023 | Offline RL, Q-Learning | Code Available | 1 | 5 |
| Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19 | Feb 9, 2021 | Benchmarking, Q-Learning | Code Available | 1 | 5 |
| Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Jun 10, 2019 | Deep Reinforcement Learning, MuJoCo | Code Available | 1 | 5 |
| Boosting Continuous Control with Consistency Policy | Oct 10, 2023 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning | May 17, 2021 | Offline RL, Q-Learning | Code Available | 1 | 5 |