| Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning | May 27, 2024 | Gym halfcheetah-medium, Gym halfcheetah-medium-expert | Code Available | 2 | 5 |
| Dungeons and Data: A Large-Scale NetHack Dataset | Nov 1, 2022 | Decision Making, NetHack | Code Available | 2 | 5 |
| Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization | May 25, 2024 | continuous-control, Continuous Control | Code Available | 2 | 5 |
| FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation | May 22, 2023 | Imitation Learning, Motion Planning | Code Available | 2 | 5 |
| Flowformer: Linearizing Transformers with Conservation Flows | Feb 13, 2022 | D4RL, Offline RL | Code Available | 2 | 5 |
| LongReward: Improving Long-context Large Language Models with AI Feedback | Oct 28, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 2 | 5 |
| Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations | Jun 9, 2022 | Benchmarking, continuous-control | Code Available | 2 | 5 |
| Pretrained LLM Adapted with LoRA as a Decision Transformer for Offline RL in Quantitative Trading | Nov 26, 2024 | Offline RL, parameter-efficient fine-tuning | Code Available | 2 | 5 |
| CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning | Apr 18, 2022 | Chatbot, Offline RL | Code Available | 2 | 5 |
| Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning | Jun 17, 2022 | Few-Shot Learning, Offline RL | Code Available | 2 | 5 |