| Iteratively Refined Behavior Regularization for Offline Reinforcement Learning | Jun 9, 2023 | D4RL, Offline RL | Unverified | 0 |
| Instructed Diffuser with Temporal Condition Guidance for Offline Reinforcement Learning | Jun 8, 2023 | Decision Making, Offline RL | Unverified | 0 |
| Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Jun 6, 2023 | D4RL, MuJoCo | Code Available | 0 |
| PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation | Jun 6, 2023 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| State Regularized Policy Optimization on Data with Dynamics Shift | Jun 6, 2023 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Survival Instinct in Offline Reinforcement Learning | Jun 5, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning | Jun 1, 2023 | Fairness, Offline RL | Unverified | 0 |
| Improving Offline RL by Blending Heuristics | Jun 1, 2023 | D4RL, Offline RL | Unverified | 0 |
| Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding | Jun 1, 2023 | Management, Offline RL | Unverified | 0 |
| IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control | Jun 1, 2023 | D4RL, Model-based Reinforcement Learning | Unverified | 0 |
| What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL? | May 30, 2023 | Imitation Learning, Offline RL | Code Available | 0 |
| Robust Reinforcement Learning Objectives for Sequential Recommender Systems | May 30, 2023 | Offline RL, Recommendation Systems | Code Available | 0 |
| Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism | May 29, 2023 | Decision Making, Econometrics | Unverified | 0 |
| Beyond Reward: Offline Preference-guided Policy Optimization | May 25, 2023 | Offline RL, reinforcement-learning | Code Available | 0 |
| The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning | May 25, 2023 | Distributional Reinforcement Learning, Offline RL | Code Available | 0 |
| Offline Primal-Dual Reinforcement Learning for Linear MDPs | May 22, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Offline Reinforcement Learning with Additional Covering Distributions | May 22, 2023 | Inductive Bias, Offline RL | Unverified | 0 |
| Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models | May 18, 2023 | MuJoCo, Offline RL | Unverified | 0 |
| SLiC-HF: Sequence Likelihood Calibration with Human Feedback | May 17, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning | May 17, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage | May 16, 2023 | Offline RL | Unverified | 0 |
| Towards Generalizable Reinforcement Learning for Trade Execution | May 12, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Explaining RL Decisions with Trajectories | May 6, 2023 | Attribute, continuous-control | Code Available | 0 |
| What can online reinforcement learning with function approximation benefit from general coverage conditions? | Apr 25, 2023 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Using Offline Data to Speed Up Reinforcement Learning in Procedurally Generated Environments | Apr 18, 2023 | Imitation Learning, Offline RL | Code Available | 0 |
| Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning | Apr 14, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Uncertainty-driven Trajectory Truncation for Data Augmentation in Offline Reinforcement Learning | Apr 10, 2023 | D4RL, Data Augmentation | Code Available | 0 |
| Unified Emulation-Simulation Training Environment for Autonomous Cyber Agents | Apr 3, 2023 | Deep Reinforcement Learning, Offline RL | Unverified | 0 |
| Enabling A Network AI Gym for Autonomous Cyber Agents | Apr 3, 2023 | Deep Reinforcement Learning, Offline RL | Unverified | 0 |
| Understanding Reinforcement Learning Algorithms: The Progress from Basic Q-learning to Proximal Policy Optimization | Mar 31, 2023 | Offline RL, Q-Learning | Unverified | 0 |
| MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations | Mar 30, 2023 | Decision Making, Imitation Learning | Code Available | 0 |
| Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions | Mar 30, 2023 | Diversity, Offline RL | Unverified | 0 |
| Deep RL with Hierarchical Action Exploration for Dialogue Generation | Mar 22, 2023 | Dialogue Generation, Offline RL | Unverified | 0 |
| Adaptive Policy Learning for Offline-to-Online Reinforcement Learning | Mar 14, 2023 | continuous-control, Continuous Control | Unverified | 0 |
| Deploying Offline Reinforcement Learning with Human Feedback | Mar 13, 2023 | Decision Making, Model Selection | Unverified | 0 |
| Graph Decision Transformer | Mar 7, 2023 | Offline RL, OpenAI Gym | Unverified | 0 |
| Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning | Mar 7, 2023 | Continuous Control, Offline RL | Unverified | 0 |
| On the Sample Complexity of Vanilla Model-Based Offline Reinforcement Learning with Dependent Samples | Mar 7, 2023 | Offline RL, Off-policy evaluation | Unverified | 0 |
| Learning to Influence Human Behavior with Offline Reinforcement Learning | Mar 3, 2023 | Autonomous Driving, Offline RL | Unverified | 0 |
| Decision Transformer under Random Frame Dropping | Mar 3, 2023 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning | Feb 28, 2023 | Offline RL, reinforcement-learning | Code Available | 0 |
| The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning | Feb 27, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation | Feb 25, 2023 | Offline RL, Q-Learning | Unverified | 0 |
| VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Feb 24, 2023 | Computational Efficiency, Offline RL | Code Available | 0 |
| Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications | Feb 15, 2023 | Decision Making, Management | Unverified | 0 |
| Language Decision Transformers with Exponential Tilt for Interactive Text Environments | Feb 10, 2023 | Offline RL | Unverified | 0 |
| A Strong Baseline for Batch Imitation Learning | Feb 6, 2023 | continuous-control, Continuous Control | Unverified | 0 |
| Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage | Feb 5, 2023 | Offline RL, Q-Learning | Unverified | 0 |
| Selective Uncertainty Propagation in Offline RL | Feb 1, 2023 | Offline RL, reinforcement-learning | Unverified | 0 |
| Revisiting Bellman Errors for Offline Model Selection | Jan 31, 2023 | Atari Games, model | Code Available | 0 |