SOTAVerified

Offline RL

Papers

Showing 251-300 of 755 papers

Title | Status | Hype
Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning | Code | 1
The Virtues of Pessimism in Inverse Reinforcement Learning | - | 0
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching | - | 0
Adaptive Q-Aid for Conditional Supervised Learning in Offline Reinforcement Learning | - | 0
ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update | Code | 1
Context-Former: Stitching via Latent Conditioned Sequence Modeling | - | 0
Multi-Object Navigation in real environments using hybrid policies | - | 0
Differentiable Tree Search Network | Code | 5
Solving Offline Reinforcement Learning with Decision Tree Regression | Code | 0
MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning | - | 0
Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model | Code | 2
Harnessing Density Ratios for Online Reinforcement Learning | - | 0
DiffClone: Enhanced Behaviour Cloning in Robotics with Diffusion-Driven Policy Learning | Code | 0
Solving Continual Offline Reinforcement Learning with Decision Transformer | - | 0
Learning from Sparse Offline Datasets via Conservative Density Estimation | Code | 0
Optimistic Model Rollouts for Pessimistic Offline Policy Optimization | - | 0
SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning | Code | 0
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond | - | 0
MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning | - | 0
Policy-regularized Offline Multi-objective Reinforcement Learning | Code | 0
POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning | Code | 0
Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning | - | 0
Online Symbolic Music Alignment with Offline Reinforcement Learning | Code | 1
PDiT: Interleaving Perception and Decision-making Transformers for Deep Reinforcement Learning | Code | 1
Critic-Guided Decision Transformer for Offline Reinforcement Learning | Code | 1
Neural Network Approximation for Pessimistic Offline Reinforcement Learning | - | 0
CUDC: A Curiosity-Driven Unsupervised Data Collection Method with Adaptive Temporal Distances for Offline Reinforcement Learning | - | 0
Advancing RAN Slicing with Offline Reinforcement Learning | - | 0
Traffic Signal Control Using Lightweight Transformers: An Offline-to-Online RL Approach | Code | 1
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning | Code | 0
The Generalization Gap in Offline Reinforcement Learning | Code | 1
Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization | - | 0
MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator | Code | 0
Evaluation of Active Feature Acquisition Methods for Static Feature Settings | - | 0
Diffused Task-Agnostic Milestone Planner | - | 0
H-GAP: Humanoid Control with a Generalist Planner | - | 0
SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation | Code | 1
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective | - | 0
Self-Driving Telescopes: Autonomous Scheduling of Astronomical Observation Campaigns with Offline Reinforcement Learning | - | 0
A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning | - | 0
Offline Reinforcement Learning for Wireless Network Optimization with Mixture Datasets | - | 0
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees | Code | 0
Rethinking Decision Transformer via Hierarchical Reinforcement Learning | - | 0
Safety-aware Causal Representation for Trustworthy Offline Reinforcement Learning in Autonomous Driving | - | 0
A Tractable Inference Perspective of Offline RL | - | 0
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity | - | 0
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning | Code | 1
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Code | 1
Robust Offline Reinforcement learning with Heavy-Tailed Rewards | Code | 0
Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | - | Unverified
2 | ADMPO | Average Reward | 81 | - | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | - | Unverified