| Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation | May 31, 2023 | D4RL, Language Modelling | Code Available | 1 |
| What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL? | May 30, 2023 | Imitation Learning, Offline RL | Code Available | 0 |
| Robust Reinforcement Learning Objectives for Sequential Recommender Systems | May 30, 2023 | Offline RL, Recommendation Systems | Code Available | 0 |
| Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism | May 29, 2023 | Decision Making, Econometrics | Unverified | 0 |
| MADiff: Offline Multi-agent Learning with Diffusion Models | May 27, 2023 | Offline RL, Q-Learning | Code Available | 1 |
| The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning | May 25, 2023 | Distributional Reinforcement Learning, Offline RL | Code Available | 0 |
| Beyond Reward: Offline Preference-guided Policy Optimization | May 25, 2023 | Offline RL, reinforcement-learning | Code Available | 0 |
| Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning | May 24, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models | May 24, 2023 | Language Modelling, Offline RL | Code Available | 1 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | May 23, 2023 | D4RL, Imitation Learning | Code Available | 1 |