SOTAVerified

D4RL

Papers

Showing 1–50 of 226 papers

Title | Status | Hype
Flow Q-Learning | Code | 3
CORL: Research-oriented Deep Offline Reinforcement Learning Library | Code | 3
Skill Expansion and Composition in Parameter Space | Code | 2
Flowformer: Linearizing Transformers with Conservation Flows | Code | 2
Datasets and Benchmarks for Offline Safe Reinforcement Learning | Code | 2
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning | Code | 2
Reformer: The Efficient Transformer | Code | 2
Rethinking Attention with Performers | Code | 2
Online Decision Transformer | Code | 2
D4RL: Datasets for Deep Data-Driven Reinforcement Learning | Code | 2
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | Code | 1
Offline Reinforcement Learning with In-sample Q-Learning | Code | 1
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization | Code | 1
Optimal Transport for Offline Imitation Learning | Code | 1
Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning | Code | 1
Offline Reinforcement Learning with Value-based Episodic Memory | Code | 1
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer | Code | 1
Katakomba: Tools and Benchmarks for Data-Driven NetHack | Code | 1
Implicit Behavioral Cloning | Code | 1
Habitizing Diffusion Planning for Efficient and Effective Decision Making | Code | 1
Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control | Code | 1
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought | Code | 1
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning | Code | 1
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning | Code | 1
Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling | Code | 1
Offline Reinforcement Learning with Implicit Q-Learning | Code | 1
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Code | 1
Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias | Code | 1
Offline RL Without Off-Policy Evaluation | Code | 1
Improving and Benchmarking Offline Reinforcement Learning Algorithms | Code | 1
M^3PC: Test-time Model Predictive Control for Pretrained Masked Trajectory Model | Code | 1
Exploration and Anti-Exploration with Distributional Random Network Distillation | Code | 1
Conservative Offline Distributional Reinforcement Learning | Code | 1
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning | Code | 1
Efficient Diffusion Policies for Offline Reinforcement Learning | Code | 1
Extreme Q-Learning: MaxEnt RL without Entropy | Code | 1
Are Expressive Models Truly Necessary for Offline RL? | Code | 1
Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning | Code | 1
Adversarially Trained Actor Critic for Offline Reinforcement Learning | Code | 1
Decision Transformer: Reinforcement Learning via Sequence Modeling | Code | 1
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning | Code | 1
Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning | Code | 1
A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Code | 1
cosFormer: Rethinking Softmax in Attention | Code | 1
Critic-Guided Decision Transformer for Offline Reinforcement Learning | Code | 1
CROP: Conservative Reward for Model-based Offline Policy Optimization | Code | 1
Curricular Subgoals for Inverse Reinforcement Learning | Code | 1
Behavior Proximal Policy Optimization | Code | 1
When should we prefer Decision Transformers for Offline Reinforcement Learning? | Code | 1
Anti-Exploration by Random Network Distillation | Code | 1
Page 1 of 5

No leaderboard results yet.