SOTAVerified

Offline RL

Papers

Showing 601–650 of 755 papers

Title | Status | Hype
Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments |  | 0
Distributionally Robust Model-Based Offline Reinforcement Learning with Near-Optimal Sample Complexity |  | 0
Offline Reinforcement Learning at Multiple Frequencies |  | 0
BCRLSP: An Offline Reinforcement Learning Framework for Sequential Targeted Promotion |  | 0
GriddlyJS: A Web IDE for Reinforcement Learning |  | 0
Offline Equilibrium Finding | Code | 0
Offline RL Policies Should be Trained to be Adaptive |  | 0
An Empirical Study of Implicit Regularization in Deep Offline RL |  | 0
Prompting Decision Transformer for Few-Shot Policy Generalization |  | 0
A Survey on Model-based Reinforcement Learning |  | 0
Bootstrapped Transformer for Offline Reinforcement Learning |  | 0
Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination | Code | 0
Contrastive Learning as Goal-Conditioned Reinforcement Learning |  | 0
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning | Code | 0
Provable Benefit of Multitask Representation Learning in Reinforcement Learning |  | 0
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward |  | 0
Federated Offline Reinforcement Learning |  | 0
Large-Scale Retrieval for Reinforcement Learning |  | 0
On the Role of Discount Factor in Offline Reinforcement Learning |  | 0
Offline Reinforcement Learning with Causal Structured World Models |  | 0
Offline Reinforcement Learning with Differential Privacy |  | 0
Model Generation with Provable Coverability for Offline Reinforcement Learning |  | 0
Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL |  | 0
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game |  | 0
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments |  | 0
Multi-Game Decision Transformers | Code | 0
Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters |  | 0
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes |  | 0
User-Interactive Offline Reinforcement Learning |  | 0
How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation |  | 0
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning |  | 0
Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers |  | 0
Learning Value Functions from Undirected State-only Experience |  | 0
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? |  | 0
Settling the Sample Complexity of Model-Based Offline Reinforcement Learning |  | 0
A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies |  | 0
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps |  | 0
Bellman Residual Orthogonalization for Offline Reinforcement Learning |  | 0
Optimizing Trajectories for Highway Driving with Offline Reinforcement Learning |  | 0
Semi-Markov Offline Reinforcement Learning for Healthcare | Code | 0
COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks | Code | 0
DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning |  | 0
On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Code | 0
Reliable validation of Reinforcement Learning Benchmarks |  | 0
A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems | Code | 0
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity |  | 0
Settling the Communication Complexity for Distributed Offline Reinforcement Learning |  | 0
Offline Reinforcement Learning with Realizability and Single-policy Concentrability |  | 0
Transferred Q-learning |  | 0
How to Leverage Unlabeled Data in Offline Reinforcement Learning |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 |  | Unverified
2 | ADMPO | Average Reward | 81 |  | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 |  | Unverified