| ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy | Feb 8, 2025 | Q-Learning, Safe Exploration | Code Available | 3 |
| Streaming Deep Reinforcement Learning Finally Works | Oct 18, 2024 | Atari Games, Deep Reinforcement Learning | Code Available | 3 |
| Simplifying Deep Temporal Difference Learning | Jul 5, 2024 | Q-Learning, Reinforcement Learning (RL) | Code Available | 3 |
| Flow Q-Learning | Feb 4, 2025 | Action Generation, D4RL | Code Available | 3 |
| Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding | Mar 12, 2024 | Multi-Agent Path Finding, Multi-agent Reinforcement Learning | Code Available | 2 |
| Safe Multi-Agent Reinforcement Learning with Bilevel Optimization in Autonomous Driving | May 28, 2024 | Autonomous Driving, Bilevel Optimization | Code Available | 2 |
| Offline RL for Natural Language Generation with Implicit Language Q Learning | Jun 5, 2022 | Language Modelling, Offline RL | Code Available | 2 |
| Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather | Jul 2, 2024 | Data Augmentation, LIDAR Semantic Segmentation | Code Available | 2 |
| Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning | Aug 12, 2022 | D4RL, Offline RL | Code Available | 2 |
| Digi-Q: Learning Q-Value Functions for Training Device-Control Agents | Feb 13, 2025 | Q-Learning, Reinforcement Learning (RL) | Code Available | 2 |
| ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency | Nov 29, 2022 | Decision Making, Multi-agent Reinforcement Learning | Code Available | 2 |
| Pretrained LLM Adapted with LoRA as a Decision Transformer for Offline RL in Quantitative Trading | Nov 26, 2024 | Offline RL, parameter-efficient fine-tuning | Code Available | 2 |
| rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch | Sep 3, 2019 | Deep Reinforcement Learning, Q-Learning | Code Available | 2 |
| Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning | Mar 2, 2024 | Decoder, Multi-agent Reinforcement Learning | Code Available | 2 |
| EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Oct 9, 2020 | Deep Reinforcement Learning, Epidemiology | Code Available | 1 |
| Distributed Heuristic Multi-Agent Path Finding with Communication | Jun 21, 2021 | Multi-Agent Path Finding, Q-Learning | Code Available | 1 |
| Evolution Strategies as a Scalable Alternative to Reinforcement Learning | Mar 10, 2017 | Atari Games, MuJoCo | Code Available | 1 |
| Discriminator Soft Actor Critic without Extrinsic Rewards | Jan 19, 2020 | Imitation Learning, Q-Learning | Code Available | 1 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | May 30, 2024 | D4RL, Denoising | Code Available | 1 |
| Distilling Reinforcement Learning Tricks for Video Games | Jul 1, 2021 | Q-Learning, reinforcement-learning | Code Available | 1 |
| Extreme Q-Learning: MaxEnt RL without Entropy | Jan 5, 2023 | D4RL, Deep Reinforcement Learning | Code Available | 1 |
| Deep Inverse Q-learning with Constraints | Aug 4, 2020 | Q-Learning | Code Available | 1 |
| DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning | Feb 16, 2021 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Continuous Deep Q-Learning with Model-based Acceleration | Mar 2, 2016 | continuous-control, Continuous Control | Code Available | 1 |
| FACMAC: Factored Multi-Agent Centralised Policy Gradients | Mar 14, 2020 | MuJoCo, Multi-agent Reinforcement Learning | Code Available | 1 |
| DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Mar 16, 2020 | Deep Reinforcement Learning, Meta-Learning | Code Available | 1 |
| Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation | Jun 23, 2021 | Continuous Control, Q-Learning | Code Available | 1 |
| Dropout Q-Functions for Doubly Efficient Reinforcement Learning | Oct 5, 2021 | Computational Efficiency, Q-Learning | Code Available | 1 |
| Energy-based Surprise Minimization for Multi-Agent Value Factorization | Sep 16, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning | May 2, 2022 | Data Augmentation, Q-Learning | Code Available | 1 |
| Conservative Q-Learning for Offline Reinforcement Learning | Jun 8, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Deep Recurrent Q-Learning for Partially Observable MDPs | Jul 23, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19 | Feb 9, 2021 | Benchmarking, Q-Learning | Code Available | 1 |
| Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Jul 10, 2021 | Q-Learning, reinforcement-learning | Code Available | 1 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RL, Imitation Learning | Code Available | 1 |
| Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Jun 7, 2021 | Multi-agent Reinforcement Learning, Offline RL | Code Available | 1 |
| Benchmarking Batch Deep Reinforcement Learning Algorithms | Oct 3, 2019 | Benchmarking, Deep Reinforcement Learning | Code Available | 1 |
| Acting in Delayed Environments with Non-Stationary Markov Policies | Jan 28, 2021 | Cloud Computing, Q-Learning | Code Available | 1 |
| Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Mar 9, 2023 | Offline RL, Q-Learning | Code Available | 1 |
| Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Dec 1, 2020 | Feature Engineering, Q-Learning | Code Available | 1 |
| Boosting Continuous Control with Consistency Policy | Oct 10, 2023 | continuous-control, Continuous Control | Code Available | 1 |
| Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem | Dec 8, 2020 | Combinatorial Optimization, Q-Learning | Code Available | 1 |
| Reinforcement Learning in High-frequency Market Making | Jul 14, 2024 | Q-Learning, reinforcement-learning | Code Available | 1 |
| Continuous control with deep reinforcement learning | Sep 9, 2015 | Action Detection, continuous-control | Code Available | 1 |
| Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Sep 22, 2023 | counterfactual, Multi-agent Reinforcement Learning | Code Available | 1 |
| Deep Active Inference for Partially Observable MDPs | Sep 8, 2020 | Deep Reinforcement Learning, Q-Learning | Code Available | 1 |
| Deep Reinforcement Learning with Double Q-learning | Sep 22, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |
| Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection | Sep 29, 2021 | Q-Learning, Traffic Signal Control | Code Available | 1 |
| A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | May 27, 2024 | Data Augmentation, Q-Learning | Code Available | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Jan 1, 2020 | Atari Games, Deep Reinforcement Learning | Code Available | 1 |