| Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization | Mar 28, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Optimal Transport for Offline Imitation Learning | Mar 24, 2023 | D4RLDecision Making | CodeCode Available | 1 |
| DataLight: Offline Data-Driven Traffic Signal Control | Mar 20, 2023 | Offline RLReinforcement Learning (RL) | CodeCode Available | 1 |
| Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Mar 9, 2023 | Offline RLQ-Learning | CodeCode Available | 1 |
| The In-Sample Softmax for Offline Reinforcement Learning | Feb 28, 2023 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| Neural Laplace Control for Continuous-time Delayed Systems | Feb 24, 2023 | Model Predictive ControlOffline RL | CodeCode Available | 1 |
| Behavior Proximal Policy Optimization | Feb 22, 2023 | D4RLOffline RL | CodeCode Available | 1 |
| Swapped goal-conditioned offline reinforcement learning | Feb 17, 2023 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| Dual RL: Unification and New Methods for Reinforcement and Imitation Learning | Feb 16, 2023 | Imitation LearningOffline RL | CodeCode Available | 1 |
| Direct Preference-based Policy Optimization without Reward Modeling | Jan 30, 2023 | Contrastive LearningOffline RL | CodeCode Available | 1 |
| Guiding Online Reinforcement Learning with Action-Free Offline Pretraining | Jan 30, 2023 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| Extreme Q-Learning: MaxEnt RL without Entropy | Jan 5, 2023 | D4RLDeep Reinforcement Learning | CodeCode Available | 1 |
| Offline Reinforcement Learning for Visual Navigation | Dec 16, 2022 | NavigateOffline RL | CodeCode Available | 1 |
| One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning | Nov 30, 2022 | AllDecision Making | CodeCode Available | 1 |
| Efficient Reinforcement Learning Through Trajectory Generation | Nov 30, 2022 | LEMMAOffline RL | CodeCode Available | 1 |
| Masked Autoencoding for Scalable and Generalizable Decision Making | Nov 23, 2022 | Decision MakingOffline RL | CodeCode Available | 1 |
| Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows | Nov 20, 2022 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size | Nov 20, 2022 | Offline RL | CodeCode Available | 1 |
| Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information | Oct 31, 2022 | Offline RLReinforcement Learning (RL) | CodeCode Available | 1 |
| Leveraging Demonstrations with Latent Space Priors | Oct 26, 2022 | Offline RL | CodeCode Available | 1 |
| Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning | Oct 25, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| MoCoDA: Model-based Counterfactual Data Augmentation | Oct 20, 2022 | counterfactualData Augmentation | CodeCode Available | 1 |
| A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Oct 15, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories | Oct 12, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| Efficient Offline Policy Optimization with a Learned Model | Oct 12, 2022 | Offline RL | CodeCode Available | 1 |
| Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning | Oct 11, 2022 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials | Oct 11, 2022 | Offline RLQ-Learning | CodeCode Available | 1 |
| BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets | Oct 7, 2022 | Autonomous DrivingBackdoor Attack | CodeCode Available | 1 |
| VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training | Sep 30, 2022 | Offline RLOpen-Ended Question Answering | CodeCode Available | 1 |
| Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling | Sep 29, 2022 | Computational EfficiencyD4RL | CodeCode Available | 1 |
| Optimistic Curiosity Exploration and Conservative Exploitation with Linear Reward Shaping | Sep 15, 2022 | continuous-controlContinuous Control | CodeCode Available | 1 |
| Efficient Planning in a Compact Latent Action Space | Aug 22, 2022 | continuous-controlContinuous Control | CodeCode Available | 1 |
| AdaCat: Adaptive Categorical Discretization for Autoregressive Models | Aug 3, 2022 | Density EstimationOffline RL | CodeCode Available | 1 |
| Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations | Jul 20, 2022 | Imitation LearningOffline RL | CodeCode Available | 1 |
| When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning | Jun 27, 2022 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| Behavior Transformers: Cloning k modes with one stone | Jun 22, 2022 | Object DetectionOffline RL | CodeCode Available | 1 |
| Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning | Jun 9, 2022 | D4RLModel-based Reinforcement Learning | CodeCode Available | 1 |
| RORL: Robust Offline Reinforcement Learning via Conservative Smoothing | Jun 6, 2022 | Decision MakingOffline RL | CodeCode Available | 1 |
| When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | May 23, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning | Apr 26, 2022 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Apr 19, 2022 | Offline RLOff-policy evaluation | CodeCode Available | 1 |
| Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes | Apr 7, 2022 | Offline RLReinforcement Learning (RL) | CodeCode Available | 1 |
| CIRS: Bursting Filter Bubbles by Counterfactual Interactive Recommender System | Apr 4, 2022 | Causal Inferencecounterfactual | CodeCode Available | 1 |
| Latent-Variable Advantage-Weighted Policy Optimization for Offline RL | Mar 16, 2022 | continuous-controlContinuous Control | CodeCode Available | 1 |
| All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL | Feb 24, 2022 | AllImitation Learning | CodeCode Available | 1 |
| Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | Feb 23, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| cosFormer: Rethinking Softmax in Attention | Feb 17, 2022 | D4RLLanguage Modeling | CodeCode Available | 1 |
| Supported Policy Optimization for Offline Reinforcement Learning | Feb 13, 2022 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL | Feb 9, 2022 | Offline RLReinforcement Learning (RL) | CodeCode Available | 1 |
| Adversarially Trained Actor Critic for Offline Reinforcement Learning | Feb 5, 2022 | continuous-controlContinuous Control | CodeCode Available | 1 |