SOTAVerified

Offline RL

Papers

Showing 626–650 of 755 papers

Title | Status | Hype
PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement | | 0
Prompting Decision Transformer for Few-Shot Policy Generalization | | 0
Provable Benefit of Multitask Representation Learning in Reinforcement Learning | | 0
What can online reinforcement learning with function approximation benefit from general coverage conditions? | | 0
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation | | 0
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward | | 0
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources | | 0
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL | | 0
Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care | | 0
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL | | 0
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning | | 0
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions | | 0
Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning | | 0
Real Robot Challenge 2022: Learning Dexterous Manipulation from Offline Data in the Real World | | 0
The Smart Buildings Control Suite: A Diverse Open Source Benchmark to Evaluate and Scale HVAC Control Policies for Sustainability | | 0
Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning | | 0
Real-World Offline Reinforcement Learning from Vision Language Model Feedback | | 0
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage | | 0
Regularized Behavior Value Estimation | | 0
Reinforced Self-Training (ReST) for Language Modeling | | 0
Reinforcement Learning: An Overview | | 0
Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action Modeling | | 0
Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data | | 0
Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism | | 0
Reliable validation of Reinforcement Learning Benchmarks | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified