SOTAVerified

Offline RL

Papers

Showing 651-675 of 755 papers

| Title | Status | Hype |
| --- | --- | --- |
| Representation Balancing Offline Model-based Reinforcement Learning | | 0 |
| Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL | Code | 0 |
| ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems | Code | 0 |
| Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning | Code | 0 |
| Learning from Sparse Offline Datasets via Conservative Density Estimation | Code | 0 |
| S2P: State-conditioned Image Synthesis for Data Augmentation in Offline Reinforcement Learning | Code | 0 |
| On the Effectiveness of Offline RL for Dialogue Response Generation | Code | 0 |
| The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning | Code | 0 |
| Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning | Code | 0 |
| Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Code | 0 |
| On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Code | 0 |
| AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | Code | 0 |
| Off-policy Evaluation in Doubly Inhomogeneous Environments | Code | 0 |
| Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood | Code | 0 |
| Offline RL With Resource Constrained Online Deployment | Code | 0 |
| Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction | Code | 0 |
| POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning | Code | 0 |
| Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | Code | 0 |
| CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization | Code | 0 |
| Policy Constraint by Only Support Constraint for Offline Reinforcement Learning | Code | 0 |
| DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning | Code | 0 |
| Fat-to-Thin Policy Optimization: Offline RL with Sparse Policies | Code | 0 |
| Explaining RL Decisions with Trajectories | Code | 0 |
| Experimental evaluation of offline reinforcement learning for HVAC control in buildings | Code | 0 |
| Offline Reinforcement Learning from Datasets with Structured Non-Stationarity | Code | 0 |

Benchmark Results

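| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |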

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |