SOTAVerified

Offline RL

Papers

Showing 626–650 of 755 papers

| Title | Status | Hype |
|---|---|---|
| Value Penalized Q-Learning for Recommender Systems | | 0 |
| Safe Driving via Expert Guided Policy Optimization | Code | 1 |
| Planning from Pixels in Environments with Combinatorially Hard Search Spaces | Code | 1 |
| Offline Reinforcement Learning with Implicit Q-Learning | Code | 1 |
| Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes | Code | 1 |
| StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning | Code | 1 |
| Representation Learning for Online and Offline RL in Low-rank MDPs | | 0 |
| Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters | | 0 |
| Offline RL With Resource Constrained Online Deployment | Code | 0 |
| You Only Evaluate Once: a Simple Baseline Algorithm for Offline RL | | 0 |
| Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble | Code | 1 |
| BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning | Code | 0 |
| Offline Reinforcement Learning with Reverse Model-based Imagination | Code | 1 |
| Reward Shifting for Optimistic Exploration and Conservative Exploitation | | 0 |
| Particle Based Stochastic Policy Optimization | | 0 |
| Variational oracle guiding for reinforcement learning | | 0 |
| Uncertainty Regularized Policy Learning for Offline Reinforcement Learning | | 0 |
| Semi-supervised Offline Reinforcement Learning with Pre-trained Decision Transformers | | 0 |
| Offline Reinforcement Learning with Resource Constrained Online Deployment | | 0 |
| Offline Reinforcement Learning with In-sample Q-Learning | Code | 1 |
| Adaptive Q-learning for Interaction-Limited Reinforcement Learning | | 0 |
| Offline Reinforcement Learning for Large Scale Language Action Spaces | | 0 |
| Pareto Policy Pool for Model-based Offline Reinforcement Learning | | 0 |
| Learning Pseudometric-based Action Representations for Offline Reinforcement Learning | | 0 |
| Data Sharing without Rewards in Multi-Task Offline Reinforcement Learning | | 0 |

Benchmark Results

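| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |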