SOTAVerified

Offline RL

Papers

Showing 551-600 of 755 papers

| Title | Status | Hype |
| --- | --- | --- |
| On the Role of Discount Factor in Offline Reinforcement Learning |  | 0 |
| RORL: Robust Offline Reinforcement Learning via Conservative Smoothing | Code | 1 |
| Offline RL for Natural Language Generation with Implicit Language Q Learning | Code | 2 |
| Offline Reinforcement Learning with Causal Structured World Models |  | 0 |
| Offline Reinforcement Learning with Differential Privacy |  | 0 |
| Model Generation with Provable Coverability for Offline Reinforcement Learning |  | 0 |
| Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL |  | 0 |
| Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game |  | 0 |
| You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments |  | 0 |
| Multi-Game Decision Transformers | Code | 0 |
| Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters |  | 0 |
| Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes |  | 0 |
| When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | Code | 1 |
| User-Interactive Offline Reinforcement Learning |  | 0 |
| How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation |  | 0 |
| Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning |  | 0 |
| Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers |  | 0 |
| RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning | Code | 1 |
| Learning Value Functions from Undirected State-only Experience |  | 0 |
| COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Code | 1 |
| CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning | Code | 2 |
| When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? |  | 0 |
| Settling the Sample Complexity of Model-Based Offline Reinforcement Learning |  | 0 |
| Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes | Code | 1 |
| CIRS: Bursting Filter Bubbles by Counterfactual Interactive Recommender System | Code | 1 |
| Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps |  | 0 |
| A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies |  | 0 |
| Bellman Residual Orthogonalization for Offline Reinforcement Learning |  | 0 |
| Optimizing Trajectories for Highway Driving with Offline Reinforcement Learning |  | 0 |
| Semi-Markov Offline Reinforcement Learning for Healthcare | Code | 0 |
| COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks | Code | 0 |
| Latent-Variable Advantage-Weighted Policy Optimization for Offline RL | Code | 1 |
| DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning |  | 0 |
| On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Code | 0 |
| Reliable validation of Reinforcement Learning Benchmarks |  | 0 |
| A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems | Code | 0 |
| Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity |  | 0 |
| All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL | Code | 1 |
| Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | Code | 1 |
| VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning | Code | 2 |
| cosFormer: Rethinking Softmax in Attention | Code | 1 |
| Supported Policy Optimization for Offline Reinforcement Learning | Code | 1 |
| Flowformer: Linearizing Transformers with Conservation Flows | Code | 2 |
| Settling the Communication Complexity for Distributed Offline Reinforcement Learning |  | 0 |
| Transferred Q-learning |  | 0 |
| Offline Reinforcement Learning with Realizability and Single-policy Concentrability |  | 0 |
| Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL | Code | 1 |
| Adversarially Trained Actor Critic for Offline Reinforcement Learning | Code | 1 |
| How to Leverage Unlabeled Data in Offline Reinforcement Learning |  | 0 |
| Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning | Code | 1 |

Benchmark Results

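| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | KFC | Average Reward | 81.8 |  | Unverified |
| 2 | ADMPO | Average Reward | 81 |  | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 |  | Unverified |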

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ParPI | D4RL Normalized Score | 151.4 |  | Unverified |