| PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement | Nov 26, 2024 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 |
| Prompting Decision Transformer for Few-Shot Policy Generalization | Jun 27, 2022 | Few-Shot Learning, Inductive Bias | —Unverified | 0 |
| Provable Benefit of Multitask Representation Learning in Reinforcement Learning | Jun 13, 2022 | Offline RL, reinforcement-learning | —Unverified | 0 |
| What can online reinforcement learning with function approximation benefit from general coverage conditions? | Apr 25, 2023 | Offline RL, Reinforcement Learning (RL) | —Unverified | 0 |
| Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation | Feb 25, 2023 | Offline RL, Q-Learning | —Unverified | 0 |
| Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward | Jun 13, 2022 | Offline RL, reinforcement-learning | —Unverified | 0 |
| Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources | Jun 14, 2023 | Offline RL, reinforcement-learning | —Unverified | 0 |
| Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL | Jun 22, 2021 | Deep Reinforcement Learning, Offline RL | —Unverified | 0 |
| Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care | Jun 13, 2023 | Offline RL, Q-Learning | —Unverified | 0 |
| Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL | Sep 8, 2022 | D4RL, Offline RL | —Unverified | 0 |
| Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning | Nov 7, 2024 | Offline RL, Policy Gradient Methods | —Unverified | 0 |
| Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions | Sep 18, 2023 | Imitation Learning, Offline RL | —Unverified | 0 |
| Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning | Sep 12, 2024 | D4RL, Offline RL | —Unverified | 0 |
| Real Robot Challenge 2022: Learning Dexterous Manipulation from Offline Data in the Real World | Aug 15, 2023 | Offline RL, reinforcement-learning | —Unverified | 0 |
| The Smart Buildings Control Suite: A Diverse Open Source Benchmark to Evaluate and Scale HVAC Control Policies for Sustainability | Oct 2, 2024 | Model Predictive Control, Offline RL | —Unverified | 0 |
| Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning | Feb 8, 2024 | Deep Reinforcement Learning, Offline RL | —Unverified | 0 |
| Real-World Offline Reinforcement Learning from Vision Language Model Feedback | Nov 8, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage | Feb 5, 2023 | Offline RL, Q-Learning | —Unverified | 0 |
| Regularized Behavior Value Estimation | Mar 17, 2021 | Offline RL | —Unverified | 0 |
| Reinforced Self-Training (ReST) for Language Modeling | Aug 17, 2023 | Language Modeling, Language Modelling | —Unverified | 0 |
| Reinforcement Learning: An Overview | Dec 6, 2024 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action Modeling | Mar 25, 2024 | Offline RL, Recommendation Systems | —Unverified | 0 |
| Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data | May 14, 2025 | Offline RL, reinforcement-learning | —Unverified | 0 |
| Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism | May 29, 2023 | Decision Making, Econometrics | —Unverified | 0 |
| Reliable validation of Reinforcement Learning Benchmarks | Mar 2, 2022 | Benchmarking, Data Compression | —Unverified | 0 |