| Title | Date | Tasks | Code |
|---|---|---|---|
| In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning | Dec 12, 2024 | Offline RL | Code Available |
| Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models | May 24, 2023 | Language Modelling, Offline RL | Code Available |
| All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL | Feb 24, 2022 | Imitation Learning | Code Available |
| Latent-Variable Advantage-Weighted Policy Optimization for Offline RL | Mar 16, 2022 | Continuous Control | Code Available |
| Behavior Transformers: Cloning k modes with one stone | Jun 22, 2022 | Object Detection, Offline RL | Code Available |
| Offline Reinforcement Learning with Reverse Model-based Imagination | Oct 1, 2021 | Data Augmentation, Model-based RL | Code Available |
| Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning | Oct 25, 2022 | D4RL, Offline RL | Code Available |
| Critic Regularized Regression | Jun 26, 2020 | Offline RL, Regression | Code Available |
| Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows | Nov 20, 2022 | Offline RL, Reinforcement Learning | Code Available |
| Behavior Proximal Policy Optimization | Feb 22, 2023 | D4RL, Offline RL | Code Available |