| Neural Interactive Collaborative Filtering | Jul 4, 2020 | Collaborative Filtering, Meta-Learning | Code Available | 1 | 5 |
| Offline Reinforcement Learning with Implicit Q-Learning | Oct 12, 2021 | D4RL, Offline RL | Code Available | 1 | 5 |
| Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection | Sep 29, 2021 | Q-Learning, Traffic Signal Control | Code Available | 1 | 5 |
| Conservative Q-Learning for Offline Reinforcement Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Continuous control with deep reinforcement learning | Sep 9, 2015 | Action Detection, continuous-control | Code Available | 1 | 5 |
| Optimal Market Making by Reinforcement Learning | Apr 8, 2021 | Q-Learning, reinforcement-learning | Code Available | 1 | 5 |
| Deep Reinforcement Learning with Double Q-learning | Sep 22, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Optimization of Molecules via Deep Reinforcement Learning | Oct 19, 2018 | Deep Reinforcement Learning, Molecular Graph Generation | Code Available | 1 | 5 |
| PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer | Jun 10, 2024 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Playing Atari with Deep Reinforcement Learning | Dec 19, 2013 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning | Feb 16, 2021 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| Q-learning with Language Model for Edit-based Unsupervised Summarization | Oct 9, 2020 | Abstractive Text Summarization, Decoder | Code Available | 1 | 5 |
| Towards Universal and Black-Box Query-Response Only Attack on LLMs with QROA | Jun 4, 2024 | Q-Learning | Code Available | 1 | 5 |
| Randomized Ensembled Double Q-Learning: Learning Fast Without a Model | Jan 15, 2021 | MuJoCo, Q-Learning | Code Available | 1 | 5 |
| FACMAC: Factored Multi-Agent Centralised Policy Gradients | Mar 14, 2020 | MuJoCo, Multi-agent Reinforcement Learning | Code Available | 1 | 5 |
| Deep Active Inference for Partially Observable MDPs | Sep 8, 2020 | Deep Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| Reinforced Lin-Kernighan-Helsgaun Algorithms for the Traveling Salesman Problems | Jul 8, 2022 | Combinatorial Optimization, Q-Learning | Code Available | 1 | 5 |
| Deep Inverse Q-learning with Constraints | Aug 4, 2020 | Q-Learning | Code Available | 1 | 5 |
| Revisiting Discrete Soft Actor-Critic | Sep 21, 2022 | Atari Games, Q-Learning | Code Available | 1 | 5 |
| Reward-free World Models for Online Imitation Learning | Oct 17, 2024 | Imitation Learning, Q-Learning | Code Available | 1 | 5 |
| DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Mar 16, 2020 | Deep Reinforcement Learning, Meta-Learning | Code Available | 1 | 5 |
| Robust Deep Reinforcement Learning through Adversarial Loss | Aug 5, 2020 | Adversarial Attack, Atari Games | Code Available | 1 | 5 |
| Energy-based Surprise Minimization for Multi-Agent Value Factorization | Sep 16, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 | 5 |
| Deep Recurrent Q-Learning for Partially Observable MDPs | Jul 23, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient | Oct 13, 2022 | Montezuma's Revenge, Q-Learning | Code Available | 1 | 5 |