SOTAVerified

Offline RL

Papers

Showing 351-375 of 755 papers

Title | Status | Hype
Deploying Offline Reinforcement Learning with Human Feedback | | 0
Measurement Scheduling for ICU Patients with Offline Reinforcement Learning | | 0
Large-Scale Retrieval for Reinforcement Learning | | 0
Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding | | 0
Bi-Level Offline Policy Optimization with Limited Exploration | | 0
Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning | | 0
Offline Actor-Critic Reinforcement Learning Scales to Large Models | | 0
Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies | | 0
Offline Policy Optimization with Variance Regularization | | 0
Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning | | 0
Large Language Model driven Policy Exploration for Recommender Systems | | 0
A Dual Approach to Imitation Learning from Observations with Offline Datasets | | 0
Language-Conditioned Offline RL for Multi-Robot Navigation | | 0
Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation | | 0
DeepThermal: Combustion Optimization for Thermal Power Generating Units Using Offline Reinforcement Learning | | 0
Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics | | 0
Model-enhanced Contrastive Reinforcement Learning for Sequential Recommendation | | 0
Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning | | 0
Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL | | 0
Deep RL with Hierarchical Action Exploration for Dialogue Generation | | 0
KAN v.s. MLP for Offline Reinforcement Learning | | 0
MOORL: A Framework for Integrating Offline-Online Reinforcement Learning | | 0
Is Pessimism Provably Efficient for Offline RL? | | 0
Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications | | 0
Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets | | 0
Page 15 of 31

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified