SOTAVerified

Offline RL

Papers

Showing 401–425 of 755 papers

Title | Status | Hype
Offline Reinforcement Learning with Additional Covering Distributions | | 0
Offline Primal-Dual Reinforcement Learning for Linear MDPs | | 0
FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation | Code | 2
Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models | | 0
Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning | | 0
SLiC-HF: Sequence Likelihood Calibration with Human Feedback | | 0
Revisiting the Minimalist Approach to Offline Reinforcement Learning | Code | 1
Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage | | 0
Towards Generalizable Reinforcement Learning for Trade Execution | | 0
Explaining RL Decisions with Trajectories | Code | 0
Masked Trajectory Models for Prediction, Representation, and Control | Code | 1
Federated Ensemble-Directed Offline Reinforcement Learning | Code | 1
Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare | Code | 1
What can online reinforcement learning with function approximation benefit from general coverage conditions? | | 0
IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies | Code | 1
Using Offline Data to Speed Up Reinforcement Learning in Procedurally Generated Environments | Code | 0
Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning | | 0
Uncertainty-driven Trajectory Truncation for Data Augmentation in Offline Reinforcement Learning | Code | 0
Unified Emulation-Simulation Training Environment for Autonomous Cyber Agents | | 0
Enabling A Network AI Gym for Autonomous Cyber Agents | | 0
Understanding Reinforcement Learning Algorithms: The Progress from Basic Q-learning to Proximal Policy Optimization | | 0
MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations | Code | 0
Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions | | 0
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization | Code | 1
Optimal Transport for Offline Imitation Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified