| On the Role of Discount Factor in Offline Reinforcement Learning | Jun 7, 2022 | D4RL, Offline RL | Unverified | 0 |
| RORL: Robust Offline Reinforcement Learning via Conservative Smoothing | Jun 6, 2022 | Decision Making, Offline RL | Code Available | 1 |
| Offline RL for Natural Language Generation with Implicit Language Q Learning | Jun 5, 2022 | Language Modelling, Offline RL | Code Available | 2 |
| Offline Reinforcement Learning with Causal Structured World Models | Jun 3, 2022 | Model-based Reinforcement Learning, Offline RL | Unverified | 0 |
| Offline Reinforcement Learning with Differential Privacy | Jun 2, 2022 | Offline RL, reinforcement-learning | Unverified | 0 |
| Model Generation with Provable Coverability for Offline Reinforcement Learning | Jun 1, 2022 | Offline RL, Out-of-Distribution Generalization | Unverified | 0 |
| Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL | Jun 1, 2022 | D4RL, Offline RL | Unverified | 0 |
| Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game | May 31, 2022 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments | May 31, 2022 | Offline RL, Playing the Game of 2048 | Unverified | 0 |
| Multi-Game Decision Transformers | May 30, 2022 | Atari Games, Offline RL | Code Available | 0 |
| Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters | May 27, 2022 | D4RL, Offline RL | Unverified | 0 |
| Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes | May 26, 2022 | Causal Inference, Offline RL | Unverified | 0 |
| When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | May 23, 2022 | D4RL, Offline RL | Code Available | 1 |
| User-Interactive Offline Reinforcement Learning | May 21, 2022 | Offline RL, reinforcement-learning | Unverified | 0 |
| How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation | May 6, 2022 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning | May 5, 2022 | Offline RL, reinforcement-learning | Unverified | 0 |
| Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers | Apr 28, 2022 | Decision Making, Offline RL | Unverified | 0 |
| RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning | Apr 26, 2022 | Offline RL, reinforcement-learning | Code Available | 1 |
| Learning Value Functions from Undirected State-only Experience | Apr 26, 2022 | Future prediction, Imitation Learning | Unverified | 0 |
| COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Apr 19, 2022 | Offline RL, Off-policy evaluation | Code Available | 1 |
| CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning | Apr 18, 2022 | Chatbot, Offline RL | Code Available | 2 |
| When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? | Apr 12, 2022 | Atari Games, Diagnostic | Unverified | 0 |
| Settling the Sample Complexity of Model-Based Offline Reinforcement Learning | Apr 11, 2022 | Offline RL, reinforcement-learning | Unverified | 0 |
| Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes | Apr 7, 2022 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| CIRS: Bursting Filter Bubbles by Counterfactual Interactive Recommender System | Apr 4, 2022 | Causal Inference, counterfactual | Code Available | 1 |
| Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps | Mar 25, 2022 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies | Mar 25, 2022 | Deep Reinforcement Learning, Offline RL | Unverified | 0 |
| Bellman Residual Orthogonalization for Offline Reinforcement Learning | Mar 24, 2022 | Offline RL, Off-policy evaluation | Unverified | 0 |
| Optimizing Trajectories for Highway Driving with Offline Reinforcement Learning | Mar 21, 2022 | Autonomous Driving, Offline RL | Unverified | 0 |
| Semi-Markov Offline Reinforcement Learning for Healthcare | Mar 17, 2022 | Offline RL, reinforcement-learning | Code Available | 0 |
| COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks | Mar 16, 2022 | Offline RL, reinforcement-learning | Code Available | 0 |
| Latent-Variable Advantage-Weighted Policy Optimization for Offline RL | Mar 16, 2022 | continuous-control, Continuous Control | Code Available | 1 |
| DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning | Mar 13, 2022 | Offline RL, reinforcement-learning | Unverified | 0 |
| On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Mar 3, 2022 | Offline RL, reinforcement-learning | Code Available | 0 |
| Reliable validation of Reinforcement Learning Benchmarks | Mar 2, 2022 | Benchmarking, Data Compression | Unverified | 0 |
| A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems | Mar 2, 2022 | Offline RL, reinforcement-learning | Code Available | 0 |
| Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity | Feb 28, 2022 | Offline RL, Q-Learning | Unverified | 0 |
| All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL | Feb 24, 2022 | All, Imitation Learning | Code Available | 1 |
| Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | Feb 23, 2022 | D4RL, Offline RL | Code Available | 1 |
| VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning | Feb 17, 2022 | Deep Reinforcement Learning, Offline RL | Code Available | 2 |
| cosFormer: Rethinking Softmax in Attention | Feb 17, 2022 | D4RL, Language Modeling | Code Available | 1 |
| Supported Policy Optimization for Offline Reinforcement Learning | Feb 13, 2022 | Offline RL, reinforcement-learning | Code Available | 1 |
| Flowformer: Linearizing Transformers with Conservation Flows | Feb 13, 2022 | D4RL, Offline RL | Code Available | 2 |
| Settling the Communication Complexity for Distributed Offline Reinforcement Learning | Feb 10, 2022 | Multi-Armed Bandits, Offline RL | Unverified | 0 |
| Transferred Q-learning | Feb 9, 2022 | Offline RL, Q-Learning | Unverified | 0 |
| Offline Reinforcement Learning with Realizability and Single-policy Concentrability | Feb 9, 2022 | Offline RL, reinforcement-learning | Unverified | 0 |
| Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL | Feb 9, 2022 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Adversarially Trained Actor Critic for Offline Reinforcement Learning | Feb 5, 2022 | continuous-control, Continuous Control | Code Available | 1 |
| How to Leverage Unlabeled Data in Offline Reinforcement Learning | Feb 3, 2022 | Offline RL, reinforcement-learning | Unverified | 0 |
| Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning | Jan 31, 2022 | Diversity, Offline RL | Code Available | 1 |