| Understanding Reinforcement Learning Algorithms: The Progress from Basic Q-learning to Proximal Policy Optimization | Mar 31, 2023 | Offline RL, Q-Learning | Unverified | 0 |
| MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations | Mar 30, 2023 | Decision Making, Imitation Learning | Code Available | 0 |
| Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions | Mar 30, 2023 | Diversity, Offline RL | Unverified | 0 |
| Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization | Mar 28, 2023 | D4RL, Offline RL | Code Available | 1 |
| Optimal Transport for Offline Imitation Learning | Mar 24, 2023 | D4RL, Decision Making | Code Available | 1 |
| Deep RL with Hierarchical Action Exploration for Dialogue Generation | Mar 22, 2023 | Dialogue Generation, Offline RL | Unverified | 0 |
| DataLight: Offline Data-Driven Traffic Signal Control | Mar 20, 2023 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| Adaptive Policy Learning for Offline-to-Online Reinforcement Learning | Mar 14, 2023 | continuous-control, Continuous Control | Unverified | 0 |
| Deploying Offline Reinforcement Learning with Human Feedback | Mar 13, 2023 | Decision Making, Model Selection | Unverified | 0 |
| Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Mar 9, 2023 | Offline RL, Q-Learning | Code Available | 1 |