SOTAVerified

Offline RL

Papers

Showing 426–450 of 755 papers

Title | Status | Hype
Deep RL with Hierarchical Action Exploration for Dialogue Generation | - | 0
DataLight: Offline Data-Driven Traffic Signal Control | Code | 1
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning | - | 0
Deploying Offline Reinforcement Learning with Human Feedback | - | 0
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Code | 1
Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning | - | 0
Graph Decision Transformer | - | 0
On the Sample Complexity of Vanilla Model-Based Offline Reinforcement Learning with Dependent Samples | - | 0
Decision Transformer under Random Frame Dropping | Code | 0
Learning to Influence Human Behavior with Offline Reinforcement Learning | - | 0
Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning | Code | 0
The In-Sample Softmax for Offline Reinforcement Learning | Code | 1
The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning | - | 0
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation | - | 0
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Code | 0
Neural Laplace Control for Continuous-time Delayed Systems | Code | 1
Behavior Proximal Policy Optimization | Code | 1
Swapped goal-conditioned offline reinforcement learning | Code | 1
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning | Code | 1
Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications | - | 0
Language Decision Transformers with Exponential Tilt for Interactive Text Environments | - | 0
A Strong Baseline for Batch Imitation Learning | - | 0
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage | - | 0
Selective Uncertainty Propagation in Offline RL | - | 0
Revisiting Bellman Errors for Offline Model Selection | Code | 0
Page 18 of 31

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | - | Unverified
2 | ADMPO | Average Reward | 81 | - | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | - | Unverified