SOTAVerified

Offline RL

Papers

Showing 301–350 of 755 papers

| Title | Status | Hype |
|---|---|---|
| Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare | | 0 |
| The Smart Buildings Control Suite: A Diverse Open Source Benchmark to Evaluate and Scale HVAC Control Policies for Sustainability | | 0 |
| ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization | | 0 |
| OffRIPP: Offline RL-based Informative Path Planning | | 0 |
| Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm | | 0 |
| KAN v.s. MLP for Offline Reinforcement Learning | | 0 |
| Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning | | 0 |
| Enhancing Cross-domain Pre-Trained Decision Transformers with Adaptive Attention | | 0 |
| The Role of Deep Learning Regularizations on Actors in Offline RL | Code | 0 |
| Tractable Offline Learning of Regular Decision Processes | | 0 |
| Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning | | 0 |
| Unsupervised-to-Online Reinforcement Learning | | 0 |
| Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning | | 0 |
| SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning | | 0 |
| Domain Adaptation for Offline Reinforcement Learning with Limited Samples | | 0 |
| Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning | Code | 0 |
| Preference-Guided Reflective Sampling for Aligning Language Models | Code | 0 |
| Offline Model-Based Reinforcement Learning with Anti-Exploration | | 0 |
| Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba | | 0 |
| Enhancing Reinforcement Learning Through Guided Search | | 0 |
| Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds | | 0 |
| Experimental evaluation of offline reinforcement learning for HVAC control in buildings | Code | 0 |
| D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning | | 0 |
| Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs | | 0 |
| Consistent time travel for realistic interactions with historical data: reinforcement learning for market making | | 0 |
| Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning | | 0 |
| Language-Conditioned Offline RL for Multi-Robot Navigation | | 0 |
| Diffusion Models as Optimizers for Efficient Planning in Offline RL | Code | 0 |
| ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems | Code | 0 |
| Sparsity-based Safety Conservatism for Constrained Offline Reinforcement Learning | | 0 |
| BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning | | 0 |
| Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning | | 0 |
| FOSP: Fine-tuning Offline Safe Policy through World Models | | 0 |
| Robust Decision Transformer: Tackling Data Corruption in Offline RL via Sequence Modeling | | 0 |
| To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning | | 0 |
| Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning | | 0 |
| Benchmarks for Reinforcement Learning with Biased Offline Data and Imperfect Simulators | | 0 |
| Preference Elicitation for Offline Reinforcement Learning | | 0 |
| Equivariant Offline Reinforcement Learning | | 0 |
| Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing | | 0 |
| Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback | | 0 |
| The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation | | 0 |
| Binary Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning | | 0 |
| SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets | | 0 |
| A Dual Approach to Imitation Learning from Observations with Offline Datasets | | 0 |
| DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning | | 0 |
| Augmenting Offline RL with Unlabeled Data | | 0 |
| CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning | | 0 |
| Integrating Domain Knowledge for handling Limited Data in Offline RL | | 0 |
| Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning | | 0 |

Benchmark Results

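| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |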