SOTAVerified

D4RL

Papers

Showing 1-50 of 226 papers

Title | Status | Hype
From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | | 0
Accelerating Residual Reinforcement Learning with Uncertainty Estimation | | 0
CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization | Code | 0
MOORL: A Framework for Integrating Offline-Online Reinforcement Learning | | 0
Policy-Based Trajectory Clustering in Offline Reinforcement Learning | | 0
Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood | Code | 0
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation | | 0
Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL | Code | 0
Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning | | 0
Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning | | 0
Imagination-Limited Q-Learning for Offline Reinforcement Learning | | 0
Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer | Code | 0
Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning | | 0
Taming OOD Actions for Offline Reinforcement Learning: An Advantage-Based Approach | | 0
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning | | 0
Directly Forecasting Belief for Reinforcement Learning with Delays | Code | 0
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning | | 0
VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning | | 0
Decision SpikeFormer: Spike-Driven Transformer for Decision Making | | 0
Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation | | 0
Diverse Transformer Decoding for Offline Reinforcement Learning Using Financial Algorithmic Approaches | | 0
Habitizing Diffusion Planning for Efficient and Effective Decision Making | Code | 1
Skill Expansion and Composition in Parameter Space | Code | 2
Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning | | 0
Flow Q-Learning | Code | 3
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network | | 0
Projection Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning | | 0
DRDT3: Diffusion-Refined Decision Test-Time Training Model | | 0
SALE-Based Offline Reinforcement Learning with Ensemble Q-Networks | | 0
SR-Reward: Taking The Path More Traveled | | 0
Goal-Conditioned Data Augmentation for Offline Reinforcement Learning | | 0
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning | | 0
Are Expressive Models Truly Necessary for Offline RL? | Code | 1
M^3PC: Test-time Model Predictive Control for Pretrained Masked Trajectory Model | Code | 1
Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting | | 0
Learning on One Mode: Addressing Multi-Modality in Offline Reinforcement Learning | Code | 0
Enhancing Decision Transformer with Diffusion-Based Trajectory Branch Generation | | 0
Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning | Code | 0
Hypercube Policy Regularization Framework for Offline Reinforcement Learning | Code | 0
NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation | Code | 0
Offline Behavior Distillation | Code | 0
Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning | | 0
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model | Code | 0
SAMG: State-Action-Aware Offline-to-Online Reinforcement Learning with Offline Model Guidance | | 0
RGMDT: Return-Gap-Minimizing Decision Tree Extraction in Non-Euclidean Metric Space | | 0
Rethinking Optimal Transport in Offline Reinforcement Learning | | 0
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Code | 0
Diffusion Model Predictive Control | | 0
KAN v.s. MLP for Offline Reinforcement Learning | | 0
Planning Transformer: Long-Horizon Offline Reinforcement Learning with Planning Tokens | | 0

No leaderboard results yet.