| Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling | Sep 29, 2022 | Computational EfficiencyD4RL | CodeCode Available | 1 |
| Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning | Jun 9, 2022 | D4RLModel-based Reinforcement Learning | CodeCode Available | 1 |
| Mildly Conservative Q-Learning for Offline Reinforcement Learning | Jun 9, 2022 | D4RLQ-Learning | CodeCode Available | 1 |
| When does return-conditioned supervised learning work for offline reinforcement learning? | Jun 2, 2022 | D4RLreinforcement-learning | CodeCode Available | 1 |
| When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | May 23, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | Feb 23, 2022 | D4RLOffline RL | CodeCode Available | 1 |
| cosFormer: Rethinking Softmax in Attention | Feb 17, 2022 | D4RLLanguage Modeling | CodeCode Available | 1 |
| Adversarially Trained Actor Critic for Offline Reinforcement Learning | Feb 5, 2022 | continuous-controlContinuous Control | CodeCode Available | 1 |
| False Correlation Reduction for Offline Reinforcement Learning | Oct 24, 2021 | D4RLDecision Making | CodeCode Available | 1 |
| Offline Reinforcement Learning with Value-based Episodic Memory | Oct 19, 2021 | D4RLOffline RL | CodeCode Available | 1 |
| Offline Reinforcement Learning with Implicit Q-Learning | Oct 12, 2021 | D4RLOffline RL | CodeCode Available | 1 |
| Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble | Oct 4, 2021 | Adroid door-clonedAdroid door-human | CodeCode Available | 1 |
| Offline Reinforcement Learning with In-sample Q-Learning | Sep 29, 2021 | D4RLOffline RL | CodeCode Available | 1 |
| Implicit Behavioral Cloning | Sep 1, 2021 | D4RL | CodeCode Available | 1 |
| Conservative Offline Distributional Reinforcement Learning | Jul 12, 2021 | D4RLDistributional Reinforcement Learning | CodeCode Available | 1 |
| Offline RL Without Off-Policy Evaluation | Jun 16, 2021 | D4RLOffline RL | CodeCode Available | 1 |
| Decision Transformer: Reinforcement Learning via Sequence Modeling | Jun 2, 2021 | Atari GamesD4RL | CodeCode Available | 1 |
| Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention | Jun 29, 2020 | D4RLLanguage Modelling | CodeCode Available | 1 |
| From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | Jul 17, 2025 | D4RLOffline RL | —Unverified | 0 |
| Accelerating Residual Reinforcement Learning with Uncertainty Estimation | Jun 21, 2025 | D4RLreinforcement-learning | —Unverified | 0 |
| CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization | Jun 18, 2025 | D4RLOffline RL | CodeCode Available | 0 |
| MOORL: A Framework for Integrating Offline-Online Reinforcement Learning | Jun 11, 2025 | D4RLDeep Reinforcement Learning | —Unverified | 0 |
| Policy-Based Trajectory Clustering in Offline Reinforcement Learning | Jun 10, 2025 | ClusteringD4RL | —Unverified | 0 |
| Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood | Jun 10, 2025 | Computational EfficiencyD4RL | CodeCode Available | 0 |
| STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation | May 27, 2025 | D4RLDenoising | —Unverified | 0 |