| Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem | Dec 8, 2020 | Combinatorial Optimization, Q-Learning | Code Available | 1 | 5 |
| Offline Reinforcement Learning with Implicit Q-Learning | Oct 12, 2021 | D4RL, Offline RL | Code Available | 1 | 5 |
| Conservative Q-Learning for Offline Reinforcement Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| On the Learning and Learnability of Quasimetrics | Jun 30, 2022 | Q-Learning, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| Optimistic Exploration even with a Pessimistic Initialisation | Feb 26, 2020 | Efficient Exploration, Q-Learning | Code Available | 1 | 5 |
| DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning | Feb 16, 2021 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | May 30, 2024 | D4RL, Denoising | Code Available | 1 | 5 |
| PGDQN: Preference-Guided Deep Q-Network | Oct 3, 2023 | Atari Games, Benchmarking | Code Available | 1 | 5 |
| Continuous Deep Q-Learning with Model-based Acceleration | Mar 2, 2016 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Playing Atari with Deep Reinforcement Learning | Dec 19, 2013 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Deep Reinforcement Learning with Double Q-learning | Sep 22, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Sep 22, 2023 | counterfactual, Multi-agent Reinforcement Learning | Code Available | 1 | 5 |
| Q-learning with Language Model for Edit-based Unsupervised Summarization | Oct 9, 2020 | Abstractive Text Summarization, Decoder | Code Available | 1 | 5 |
| QPLEX: Duplex Dueling Multi-Agent Q-Learning | Aug 3, 2020 | Decision Making, Multi-agent Reinforcement Learning | Code Available | 1 | 5 |
| Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning | Jul 12, 2022 | Lifelong learning, Policy Gradient Methods | Code Available | 1 | 5 |
| Reasoning with Latent Diffusion in Offline Reinforcement Learning | Sep 12, 2023 | D4RL, Offline RL | Code Available | 1 | 5 |
| Reinforced Lin-Kernighan-Helsgaun Algorithms for the Traveling Salesman Problems | Jul 8, 2022 | Combinatorial Optimization, Q-Learning | Code Available | 1 | 5 |
| Deep Active Inference for Partially Observable MDPs | Sep 8, 2020 | Deep Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| Revisiting Discrete Soft Actor-Critic | Sep 21, 2022 | Atari Games, Q-Learning | Code Available | 1 | 5 |
| FACMAC: Factored Multi-Agent Centralised Policy Gradients | Mar 14, 2020 | MuJoCo, Multi-agent Reinforcement Learning | Code Available | 1 | 5 |
| Reward Machines for Cooperative Multi-Agent Reinforcement Learning | Jul 3, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Dec 1, 2020 | Feature Engineering, Q-Learning | Code Available | 1 | 5 |
| Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection | Sep 29, 2021 | Q-Learning, Traffic Signal Control | Code Available | 1 | 5 |
| Robust Q-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty | Sep 30, 2022 | Q-Learning | Code Available | 1 | 5 |
| HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation | May 4, 2021 | Bayesian Optimization, Q-Learning | Code Available | 1 | 5 |