SOTAVerified

Offline RL

Papers

Showing 451–500 of 755 papers

Title | Status | Hype
Skill Decision Transformer | Code | 0
Direct Preference-based Policy Optimization without Reward Modeling | Code | 1
Identifying Expert Behavior in Offline Training Datasets Improves Behavioral Cloning of Robotic Manipulation Policies | Code | 0
Guiding Online Reinforcement Learning with Action-Free Offline Pretraining | Code | 1
Learning to View: Decision Transformers for Active Object Detection |  | 0
Extreme Q-Learning: MaxEnt RL without Entropy | Code | 1
Offline Evaluation for Reinforcement Learning-based Recommendation: A Critical Issue and Some Alternatives |  | 0
Benchmarks and Algorithms for Offline Preference-Based Reward Learning |  | 0
Offline Policy Optimization in RL with Variance Regularizaton |  | 0
Representation Learning in Deep RL via Discrete Information Bottleneck |  | 0
Offline Reinforcement Learning via Linear-Programming with Error-Bound Induced Constraints |  | 0
Offline Reinforcement Learning for Visual Navigation | Code | 1
Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies |  | 0
Confidence-Conditioned Value Functions for Offline Reinforcement Learning |  | 0
Benchmarking Offline Reinforcement Learning Algorithms for E-Commerce Order Fraud Evaluation |  | 0
TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets | Code | 0
Launchpad: Learning to Schedule Using Offline and Online RL Methods |  | 0
One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning | Code | 1
Efficient Reinforcement Learning Through Trajectory Generation | Code | 1
Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning | Code | 0
Offline Policy Evaluation and Optimization under Confounding |  | 0
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators |  | 0
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes |  | 0
Is Conditional Generative Modeling all you need for Decision-Making? |  | 0
State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning |  | 0
Domain Generalization for Robust Model-Based Offline Reinforcement Learning |  | 0
Masked Autoencoding for Scalable and Generalizable Decision Making | Code | 1
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation |  | 0
A Low Latency Adaptive Coding Spiking Framework for Deep Reinforcement Learning | Code | 0
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows | Code | 1
Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size | Code | 1
Contextual Transformer for Offline Meta Reinforcement Learning |  | 0
Offline Reinforcement Learning with Adaptive Behavior Regularization |  | 0
Leveraging Offline Data in Online Reinforcement Learning |  | 0
ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data |  | 0
Wall Street Tree Search: Risk-Aware Planning for Offline Reinforcement Learning |  | 0
Contrastive Value Learning: Implicit Models for Simple Offline RL |  | 0
Oracle Inequalities for Model Selection in Offline Reinforcement Learning |  | 0
Dual Generator Offline Reinforcement Learning |  | 0
Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints |  | 0
Behavior Prior Representation learning for Offline Reinforcement Learning | Code | 0
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian |  | 0
Dungeons and Data: A Large-Scale NetHack Dataset | Code | 2
Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information | Code | 1
Leveraging Demonstrations with Latent Space Priors | Code | 1
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning | Code | 1
Implicit Offline Reinforcement Learning via Supervised Learning |  | 0
The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning | Code | 0
MoCoDA: Model-based Counterfactual Data Augmentation | Code | 1
Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation |  | 0
Page 10 of 16

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 |  | Unverified
2 | ADMPO | Average Reward | 81 |  | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 |  | Unverified