SOTAVerified

Offline RL

Papers

Showing 651–700 of 755 papers

Title | Status | Hype
CrowdPlay: Crowdsourcing human demonstration data for offline learning in Atari games | | 0
Should I Run Offline Reinforcement Learning or Behavioral Cloning? | | 0
Targeted Environment Design from Offline Data | | 0
The Essential Elements of Offline RL via Supervised Learning | | 0
Why so pessimistic? Estimating uncertainties for offline RL through ensembles, and why their independence matters. | | 0
A Workflow for Offline Model-Free Robotic Reinforcement Learning | Code | 1
Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation | | 0
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning | | 0
DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning | Code | 0
Policy Gradients Incorporating the Future | | 0
Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings | Code | 1
Offline Preference-Based Apprenticeship Learning | | 0
Constraints Penalized Q-learning for Safe Offline Reinforcement Learning | | 0
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage | | 0
Conservative Offline Distributional Reinforcement Learning | Code | 1
Enhancing Video Analytics Accuracy via Real-time Automated Camera Parameter Tuning | | 0
Offline Meta-Reinforcement Learning with Online Self-Supervision | Code | 1
The Least Restriction for Offline Reinforcement Learning | | 0
Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning | Code | 0
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble | Code | 1
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL | | 0
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation | Code | 1
Boosting Offline Reinforcement Learning with Residual Generative Modeling | | 0
Offline RL Without Off-Policy Evaluation | Code | 1
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL | | 0
On Multi-objective Policy Optimization as a Tool for Reinforcement Learning: Case Studies in Offline RL and Finetuning | | 0
Reinforcement Learning as One Big Sequence Modeling Problem | Code | 1
A Minimalist Approach to Offline Reinforcement Learning | Code | 1
Corruption-Robust Offline Reinforcement Learning | | 0
Offline Reinforcement Learning as Anti-Exploration | | 0
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning | | 0
Offline Inverse Reinforcement Learning | | 0
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Code | 1
Online reinforcement learning with sparse rewards through an active inference capsule | Code | 1
Offline Reinforcement Learning as One Big Sequence Modeling Problem | Code | 1
Decision Transformer: Reinforcement Learning via Sequence Modeling | Code | 1
Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Reinforcement Learning | | 0
Revisiting Design Choices in Offline Model Based Reinforcement Learning | | 0
Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning | Code | 1
Model-Based Offline Planning with Trajectory Pruning | Code | 0
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings | | 0
Interpretable performance analysis towards offline reinforcement learning: A dataset perspective | | 0
InferNet for Delayed Reinforcement Tasks: Addressing the Temporal Credit Assignment Problem | | 0
Online and Offline Reinforcement Learning by Planning with a Learned Model | Code | 1
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism | | 0
Regularized Behavior Value Estimation | | 0
Offline Reinforcement Learning with Fisher Divergence Critic Regularization | | 0
Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks | | 0
S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning | | 0
Instabilities of Offline RL with Pre-Trained Neural Representation | | 0
Page 14 of 16

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified