| Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning | Jul 10, 2023 | continuous-control, Continuous Control | Unverified | 0 |
| Alleviating Matthew Effect of Offline Reinforcement Learning in Interactive Recommendation | Jul 10, 2023 | Decision Making, Interactive Recommendation | Code Available | 1 |
| Goal-Conditioned Predictive Coding for Offline Reinforcement Learning | Jul 7, 2023 | Decision Making, Offline RL | Unverified | 0 |
| Offline Reinforcement Learning with Imbalanced Datasets | Jul 6, 2023 | D4RL, Offline RL | Unverified | 0 |
| LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning | Jul 5, 2023 | Offline RL, Q-Learning | Unverified | 0 |
| Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning | Jul 1, 2023 | D4RL, model | Code Available | 1 |
| Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning | Jun 27, 2023 | D4RL, Offline RL | Unverified | 0 |
| Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization | Jun 26, 2023 | Offline RL, Test-time Adaptation | Unverified | 0 |
| ChiPFormer: Transferable Chip Placement via Offline Decision Transformer | Jun 26, 2023 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching | Jun 24, 2023 | Imitation Learning, Offline RL | Unverified | 0 |
| Offline Policy Evaluation for Reinforcement Learning with Adaptively Collected Data | Jun 24, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| CLUE: Calibrated Latent Guidance for Offline Reinforcement Learning | Jun 23, 2023 | Imitation Learning, Offline RL | Unverified | 0 |
| Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting | Jun 22, 2023 | Offline RL, reinforcement-learning | Code Available | 1 |
| Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning | Jun 22, 2023 | Data Augmentation, Offline RL | Code Available | 1 |
| Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap | Jun 20, 2023 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| π2vec: Policy Representations with Successor Features | Jun 16, 2023 | Offline RL | Unverified | 0 |
| Automatic Trade-off Adaptation in Offline RL | Jun 16, 2023 | Offline RL | Unverified | 0 |
| Semi-Offline Reinforcement Learning for Optimized Text Generation | Jun 16, 2023 | Offline RL, reinforcement-learning | Code Available | 0 |
| Offline Multi-Agent Reinforcement Learning with Coupled Value Factorization | Jun 15, 2023 | Management, Multi-agent Reinforcement Learning | Unverified | 0 |
| Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources | Jun 14, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Off-policy Evaluation in Doubly Inhomogeneous Environments | Jun 14, 2023 | Offline RL, Off-policy evaluation | Code Available | 0 |
| Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective | Jun 13, 2023 | Learning-To-Rank, Offline RL | Code Available | 0 |
| Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care | Jun 13, 2023 | Offline RL, Q-Learning | Unverified | 0 |
| A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning | Jun 13, 2023 | D4RL, Efficient Exploration | Unverified | 0 |
| ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles | Jun 12, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Policy Regularization with Dataset Constraint for Offline Reinforcement Learning | Jun 11, 2023 | Offline RL, reinforcement-learning | Code Available | 1 |
| Iteratively Refined Behavior Regularization for Offline Reinforcement Learning | Jun 9, 2023 | D4RL, Offline RL | Unverified | 0 |
| Instructed Diffuser with Temporal Condition Guidance for Offline Reinforcement Learning | Jun 8, 2023 | Decision Making, Offline RL | Unverified | 0 |
| Decoupled Prioritized Resampling for Offline RL | Jun 8, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline RL | Jun 7, 2023 | Data Augmentation, Offline RL | Code Available | 1 |
| Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Jun 6, 2023 | D4RL, MuJoCo | Code Available | 0 |
| PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation | Jun 6, 2023 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| State Regularized Policy Optimization on Data with Dynamics Shift | Jun 6, 2023 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Survival Instinct in Offline Reinforcement Learning | Jun 5, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding | Jun 1, 2023 | Management, Offline RL | Unverified | 0 |
| Improving and Benchmarking Offline Reinforcement Learning Algorithms | Jun 1, 2023 | Attribute, Benchmarking | Code Available | 1 |
| Improving Offline RL by Blending Heuristics | Jun 1, 2023 | D4RL, Offline RL | Unverified | 0 |
| IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control | Jun 1, 2023 | D4RL, Model-based Reinforcement Learning | Unverified | 0 |
| Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning | Jun 1, 2023 | Fairness, Offline RL | Unverified | 0 |
| Efficient Diffusion Policies for Offline Reinforcement Learning | May 31, 2023 | D4RL, Offline RL | Code Available | 1 |
| Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation | May 31, 2023 | D4RL, Language Modelling | Code Available | 1 |
| What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL? | May 30, 2023 | Imitation Learning, Offline RL | Code Available | 0 |
| Robust Reinforcement Learning Objectives for Sequential Recommender Systems | May 30, 2023 | Offline RL, Recommendation Systems | Code Available | 0 |
| Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism | May 29, 2023 | Decision Making, Econometrics | Unverified | 0 |
| MADiff: Offline Multi-agent Learning with Diffusion Models | May 27, 2023 | Offline RL, Q-Learning | Code Available | 1 |
| The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning | May 25, 2023 | Distributional Reinforcement Learning, Offline RL | Code Available | 0 |
| Beyond Reward: Offline Preference-guided Policy Optimization | May 25, 2023 | Offline RL, reinforcement-learning | Code Available | 0 |
| Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning | May 24, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models | May 24, 2023 | Language Modelling, Offline RL | Code Available | 1 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RL, Imitation Learning | Code Available | 1 |