SOTAVerified

Offline RL

Papers

Showing 526-550 of 755 papers

| Title | Status | Hype |
| --- | --- | --- |
| Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning | Code | 2 |
| Distributionally Robust Model-Based Offline Reinforcement Learning with Near-Optimal Sample Complexity | | 0 |
| AdaCat: Adaptive Categorical Discretization for Autoregressive Models | Code | 1 |
| Offline Reinforcement Learning at Multiple Frequencies | | 0 |
| Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations | Code | 1 |
| BCRLSP: An Offline Reinforcement Learning Framework for Sequential Targeted Promotion | | 0 |
| GriddlyJS: A Web IDE for Reinforcement Learning | | 0 |
| Offline Equilibrium Finding | Code | 0 |
| Offline RL Policies Should be Trained to be Adaptive | | 0 |
| An Empirical Study of Implicit Regularization in Deep Offline RL | | 0 |
| Prompting Decision Transformer for Few-Shot Policy Generalization | | 0 |
| When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning | Code | 1 |
| Behavior Transformers: Cloning k modes with one stone | Code | 1 |
| A Survey on Model-based Reinforcement Learning | | 0 |
| Bootstrapped Transformer for Offline Reinforcement Learning | | 0 |
| Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning | Code | 2 |
| Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination | Code | 0 |
| Contrastive Learning as Goal-Conditioned Reinforcement Learning | | 0 |
| Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning | Code | 0 |
| Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward | | 0 |
| Provable Benefit of Multitask Representation Learning in Reinforcement Learning | | 0 |
| Federated Offline Reinforcement Learning | | 0 |
| Large-Scale Retrieval for Reinforcement Learning | | 0 |
| Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations | Code | 2 |
| Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning | Code | 1 |
Page 22 of 31

Benchmark Results

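| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |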