SOTAVerified

Offline RL

Papers

Showing 551-575 of 755 papers

Title | Status | Hype
On the Role of Discount Factor in Offline Reinforcement Learning | - | 0
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing | Code | 1
Offline RL for Natural Language Generation with Implicit Language Q Learning | Code | 2
Offline Reinforcement Learning with Causal Structured World Models | - | 0
Offline Reinforcement Learning with Differential Privacy | - | 0
Model Generation with Provable Coverability for Offline Reinforcement Learning | - | 0
Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL | - | 0
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game | - | 0
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments | - | 0
Multi-Game Decision Transformers | Code | 0
Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters | - | 0
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes | - | 0
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | Code | 1
User-Interactive Offline Reinforcement Learning | - | 0
How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation | - | 0
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning | - | 0
Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers | - | 0
RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning | Code | 1
Learning Value Functions from Undirected State-only Experience | - | 0
COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | Code | 1
CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning | Code | 2
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? | - | 0
Settling the Sample Complexity of Model-Based Offline Reinforcement Learning | - | 0
Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes | Code | 1
CIRS: Bursting Filter Bubbles by Counterfactual Interactive Recommender System | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | - | Unverified
2 | ADMPO | Average Reward | 81 | - | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | - | Unverified