| Language-Conditioned Offline RL for Multi-Robot Navigation | Jul 29, 2024 | Offline RL, Robot Navigation | Unverified | 0 |
| A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data | Jul 23, 2024 | Autonomous Driving, Autonomous Racing | Code Available | 2 |
| Diffusion Models as Optimizers for Efficient Planning in Offline RL | Jul 23, 2024 | D4RL, Decision Making | Code Available | 0 |
| ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems | Jul 18, 2024 | Offline RL, Recommendation Systems | Code Available | 0 |
| Sparsity-based Safety Conservatism for Constrained Offline Reinforcement Learning | Jul 17, 2024 | Autonomous Driving, Decision Making | Unverified | 0 |
| BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning | Jul 15, 2024 | Model-based Reinforcement Learning, Offline RL | Unverified | 0 |
| Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning | Jul 10, 2024 | Decision Making, Offline RL | Unverified | 0 |
| FOSP: Fine-tuning Offline Safe Policy through World Models | Jul 6, 2024 | Model-based Reinforcement Learning, Offline RL | Unverified | 0 |
| Robust Decision Transformer: Tackling Data Corruption in Offline RL via Sequence Modeling | Jul 5, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning | Jul 1, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Benchmarks for Reinforcement Learning with Biased Offline Data and Imperfect Simulators | Jun 30, 2024 | Autonomous Vehicles, Offline RL | Unverified | 0 |
| Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning | Jun 30, 2024 | D4RL, Offline RL | Unverified | 0 |
| Preference Elicitation for Offline Reinforcement Learning | Jun 26, 2024 | Offline RL, reinforcement-learning | Unverified | 0 |
| Equivariant Offline Reinforcement Learning | Jun 20, 2024 | Offline RL, Q-Learning | Unverified | 0 |
| Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing | Jun 20, 2024 | Autonomous Driving, Data Augmentation | Unverified | 0 |
| Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback | Jun 18, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation | Jun 17, 2024 | Offline RL | Unverified | 0 |
| DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | Jun 14, 2024 | Offline RL | Code Available | 3 |
| Binary Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning | Jun 14, 2024 | D4RL, Offline RL | Unverified | 0 |
| SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets | Jun 13, 2024 | D4RL, Offline RL | Unverified | 0 |
| DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning | Jun 13, 2024 | D4RL, Offline RL | Unverified | 0 |
| A Dual Approach to Imitation Learning from Observations with Offline Datasets | Jun 13, 2024 | Imitation Learning, Offline RL | Unverified | 0 |
| Is Value Learning Really the Main Bottleneck in Offline RL? | Jun 13, 2024 | Imitation Learning, Offline RL | Code Available | 3 |
| Augmenting Offline RL with Unlabeled Data | Jun 11, 2024 | Offline RL, Transfer Learning | Unverified | 0 |
| CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning | Jun 11, 2024 | D4RL, Denoising | Unverified | 0 |
| Integrating Domain Knowledge for handling Limited Data in Offline RL | Jun 11, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer | Jun 10, 2024 | continuous-control, Continuous Control | Code Available | 1 |
| Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Jun 10, 2024 | Deep Reinforcement Learning, Offline RL | Code Available | 0 |
| Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning | Jun 10, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL | Jun 8, 2024 | Data Augmentation, Mamba | Code Available | 0 |
| Stabilizing Extreme Q-learning by Maclaurin Expansion | Jun 7, 2024 | D4RL, Offline RL | Code Available | 0 |
| Strategically Conservative Q-Learning | Jun 6, 2024 | D4RL, Offline RL | Code Available | 1 |
| Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models | Jun 6, 2024 | Offline RL, reinforcement-learning | Unverified | 0 |
| UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning | Jun 5, 2024 | D4RL, Offline RL | Unverified | 0 |
| A Fast Convergence Theory for Offline Decision Making | Jun 3, 2024 | Decision Making, Offline RL | Unverified | 0 |
| Causal prompting model-based offline reinforcement learning | Jun 3, 2024 | model, Offline RL | Unverified | 0 |
| Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | May 30, 2024 | D4RL, Denoising | Code Available | 1 |
| Inverse Concave-Utility Reinforcement Learning is Inverse Game Theory | May 29, 2024 | Imitation Learning, Offline RL | Unverified | 0 |
| Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning | May 29, 2024 | Offline RL, reinforcement-learning | Unverified | 0 |
| Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination | May 28, 2024 | Offline RL, reinforcement-learning | Code Available | 1 |
| Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL | May 28, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 1 |
| AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | May 28, 2024 | D4RL, Offline RL | Code Available | 0 |
| Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier | May 28, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | May 27, 2024 | Decision Making, Offline RL | Unverified | 0 |
| Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear q^π-Realizability and Concentrability | May 27, 2024 | Computational Efficiency, Offline RL | Unverified | 0 |
| Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning | May 27, 2024 | Gym halfcheetah-medium, Gym halfcheetah-medium-expert | Code Available | 2 |
| Q-value Regularized Transformer for Offline Reinforcement Learning | May 27, 2024 | D4RL, Offline RL | Code Available | 1 |
| GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning | May 27, 2024 | Data Augmentation, Decision Making | Code Available | 1 |
| Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization | May 25, 2024 | continuous-control, Continuous Control | Code Available | 2 |
| Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search | May 24, 2024 | Code Generation, Language Modelling | Code Available | 1 |