| Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning | Jun 9, 2022 | D4RL, Model-based Reinforcement Learning | Code Available | 1 |
| Mildly Conservative Q-Learning for Offline Reinforcement Learning | Jun 9, 2022 | D4RL, Q-Learning | Code Available | 1 |
| On the Role of Discount Factor in Offline Reinforcement Learning | Jun 7, 2022 | D4RL, Offline RL | Unverified | 0 |
| When does return-conditioned supervised learning work for offline reinforcement learning? | Jun 2, 2022 | D4RL, reinforcement-learning | Code Available | 1 |
| Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL | Jun 1, 2022 | D4RL, Offline RL | Unverified | 0 |
| Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters | May 27, 2022 | D4RL, Offline RL | Unverified | 0 |
| When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | May 23, 2022 | D4RL, Offline RL | Code Available | 1 |
| Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | Feb 23, 2022 | D4RL, Offline RL | Code Available | 1 |
| A Behavior Regularized Implicit Policy for Offline Reinforcement Learning | Feb 19, 2022 | D4RL, reinforcement-learning | Unverified | 0 |
| cosFormer: Rethinking Softmax in Attention | Feb 17, 2022 | D4RL, Language Modeling | Code Available | 1 |