| Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search | May 24, 2024 | Code GenerationLanguage Modelling | CodeCode Available | 1 | 5 |
| Latent-Variable Advantage-Weighted Policy Optimization for Offline RL | Mar 16, 2022 | continuous-controlContinuous Control | CodeCode Available | 1 | 5 |
| cosFormer: Rethinking Softmax in Attention | Feb 17, 2022 | D4RLLanguage Modeling | CodeCode Available | 1 | 5 |
| Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials | Oct 11, 2022 | Offline RLQ-Learning | CodeCode Available | 1 | 5 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RLImitation Learning | CodeCode Available | 1 | 5 |
| Are Expressive Models Truly Necessary for Offline RL? | Dec 15, 2024 | D4RLOffline RL | CodeCode Available | 1 | 5 |
| Direct Preference-based Policy Optimization without Reward Modeling | Jan 30, 2023 | Contrastive LearningOffline RL | CodeCode Available | 1 | 5 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | May 30, 2024 | D4RLDenoising | CodeCode Available | 1 | 5 |
| COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Apr 19, 2022 | Offline RLOff-policy evaluation | CodeCode Available | 1 | 5 |
| Efficient Diffusion Policies for Offline Reinforcement Learning | May 31, 2023 | D4RLOffline RL | CodeCode Available | 1 | 5 |