SOTAVerified

Offline RL

Papers

Showing 551-600 of 755 papers

Title | Status | Hype
Skill Decision Transformer | Code | 0
Identifying Expert Behavior in Offline Training Datasets Improves Behavioral Cloning of Robotic Manipulation Policies | Code | 0
Learning to View: Decision Transformers for Active Object Detection | - | 0
Benchmarks and Algorithms for Offline Preference-Based Reward Learning | - | 0
Offline Evaluation for Reinforcement Learning-based Recommendation: A Critical Issue and Some Alternatives | - | 0
Offline Policy Optimization in RL with Variance Regularizaton | - | 0
Offline Reinforcement Learning via Linear-Programming with Error-Bound Induced Constraints | - | 0
Representation Learning in Deep RL via Discrete Information Bottleneck | - | 0
Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies | - | 0
Confidence-Conditioned Value Functions for Offline Reinforcement Learning | - | 0
Benchmarking Offline Reinforcement Learning Algorithms for E-Commerce Order Fraud Evaluation | - | 0
TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets | Code | 0
Launchpad: Learning to Schedule Using Offline and Online RL Methods | - | 0
Offline Policy Evaluation and Optimization under Confounding | - | 0
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators | - | 0
Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning | Code | 0
Is Conditional Generative Modeling all you need for Decision-Making? | - | 0
State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning | - | 0
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes | - | 0
Domain Generalization for Robust Model-Based Offline Reinforcement Learning | - | 0
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation | - | 0
A Low Latency Adaptive Coding Spiking Framework for Deep Reinforcement Learning | Code | 0
Offline Reinforcement Learning with Adaptive Behavior Regularization | - | 0
Contextual Transformer for Offline Meta Reinforcement Learning | - | 0
Leveraging Offline Data in Online Reinforcement Learning | - | 0
ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data | - | 0
Wall Street Tree Search: Risk-Aware Planning for Offline Reinforcement Learning | - | 0
Contrastive Value Learning: Implicit Models for Simple Offline RL | - | 0
Oracle Inequalities for Model Selection in Offline Reinforcement Learning | - | 0
Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints | - | 0
Behavior Prior Representation learning for Offline Reinforcement Learning | Code | 0
Dual Generator Offline Reinforcement Learning | - | 0
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian | - | 0
Implicit Offline Reinforcement Learning via Supervised Learning | - | 0
The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning | Code | 0
Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation | - | 0
Boosting Offline Reinforcement Learning via Data Rebalancing | - | 0
Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data | - | 0
Mutual Information Regularized Offline Reinforcement Learning | Code | 0
Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief | Code | 0
State Advantage Weighting for Offline RL | - | 0
The Role of Coverage in Online Reinforcement Learning | - | 0
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient | - | 0
S2P: State-conditioned Image Synthesis for Data Augmentation in Offline Reinforcement Learning | Code | 0
Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes | - | 0
Can Offline Reinforcement Learning Help Natural Language Understanding? | - | 0
Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation | - | 0
Task-Agnostic Learning to Accomplish New Tasks | - | 0
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL | - | 0
Dialogue Evaluation with Offline Reinforcement Learning | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | - | Unverified
2 | ADMPO | Average Reward | 81 | - | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | - | Unverified