SOTAVerified

Offline RL

Papers

Showing 401-425 of 755 papers

| Title | Status | Hype |
| --- | --- | --- |
| Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices | | 0 |
| Offline Actor-Critic Reinforcement Learning Scales to Large Models | | 0 |
| Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning | | 0 |
| A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs | | 0 |
| Contrastive Diffuser: Planning Towards High Return States via Contrastive Learning | | 0 |
| The Virtues of Pessimism in Inverse Reinforcement Learning | | 0 |
| DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching | | 0 |
| Adaptive Q-Aid for Conditional Supervised Learning in Offline Reinforcement Learning | | 0 |
| Context-Former: Stitching via Latent Conditioned Sequence Modeling | | 0 |
| Multi-Object Navigation in real environments using hybrid policies | | 0 |
| MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning | | 0 |
| Solving Offline Reinforcement Learning with Decision Tree Regression | Code | 0 |
| Harnessing Density Ratios for Online Reinforcement Learning | | 0 |
| DiffClone: Enhanced Behaviour Cloning in Robotics with Diffusion-Driven Policy Learning | Code | 0 |
| Learning from Sparse Offline Datasets via Conservative Density Estimation | Code | 0 |
| Solving Continual Offline Reinforcement Learning with Decision Transformer | | 0 |
| Optimistic Model Rollouts for Pessimistic Offline Policy Optimization | | 0 |
| On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond | | 0 |
| MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning | | 0 |
| SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning | Code | 0 |
| Policy-regularized Offline Multi-objective Reinforcement Learning | Code | 0 |
| POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning | Code | 0 |
| Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning | | 0 |
| Neural Network Approximation for Pessimistic Offline Reinforcement Learning | | 0 |
| CUDC: A Curiosity-Driven Unsupervised Data Collection Method with Adaptive Temporal Distances for Offline Reinforcement Learning | | 0 |
Page 17 of 31

Benchmark Results

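| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |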