| Critic Regularized Regression | Jun 26, 2020 | Offline RLregression | CodeCode Available | 1 |
| Conservative Q-Learning for Offline Reinforcement Learning | Jun 8, 2020 | continuous-controlContinuous Control | CodeCode Available | 1 |
| Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Jun 5, 2020 | Offline RLreinforcement-learning | CodeCode Available | 1 |
| Acme: A Research Framework for Distributed Reinforcement Learning | Jun 1, 2020 | Deep Reinforcement LearningDQN Replay Dataset | CodeCode Available | 1 |
| MOPO: Model-based Offline Policy Optimization | May 27, 2020 | continuous-controlContinuous Control | CodeCode Available | 1 |
| MOReL : Model-Based Offline Reinforcement Learning | May 12, 2020 | modelOffline RL | CodeCode Available | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Jan 1, 2020 | Atari GamesDeep Reinforcement Learning | CodeCode Available | 1 |
| An Optimistic Perspective on Offline Reinforcement Learning | Jul 10, 2019 | Atari GamesDiversity | CodeCode Available | 1 |
| From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | Jul 17, 2025 | D4RLOffline RL | —Unverified | 0 |
| Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs | Jul 15, 2025 | DiversityMMLU | CodeCode Available | 0 |