SOTAVerified

Offline RL

Papers

Showing 651–700 of 755 papers

Title | Status | Hype
Launchpad: Learning to Schedule Using Offline and Online RL Methods | | 0
Learning Dexterous Manipulation from Suboptimal Experts | | 0
Learning Pseudometric-based Action Representations for Offline Reinforcement Learning | | 0
Learning to Clarify by Reinforcement Learning Through Reward-Weighted Fine-Tuning | | 0
Learning to Influence Human Behavior with Offline Reinforcement Learning | | 0
Learning to View: Decision Transformers for Active Object Detection | | 0
Learning Value Functions from Undirected State-only Experience | | 0
Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains | | 0
Leveraging Offline Data in Online Reinforcement Learning | | 0
Leveraging Optimal Transport for Enhanced Offline Reinforcement Learning in Surgical Robotic Environments | | 0
LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble | | 0
LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning | | 0
Language Decision Transformers with Exponential Tilt for Interactive Text Environments | | 0
Measurement Scheduling for ICU Patients with Offline Reinforcement Learning | | 0
Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning | | 0
Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning | | 0
Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization | | 0
Model-Based Offline Planning | | 0
Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation | | 0
Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds | | 0
Model-enhanced Contrastive Reinforcement Learning for Sequential Recommendation | | 0
Model Generation with Provable Coverability for Offline Reinforcement Learning | | 0
MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning | | 0
MOORL: A Framework for Integrating Offline-Online Reinforcement Learning | | 0
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning | | 0
MOReL: Model-Based Offline Reinforcement Learning | | 0
MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning | | 0
Multi-Objective Decision Transformers for Offline Reinforcement Learning | | 0
Multi-Objective-Optimization Multi-AUV Assisted Data Collection Framework for IoUT Based on Offline Reinforcement Learning | | 0
Multi-Object Navigation in real environments using hybrid policies | | 0
Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning | | 0
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game | | 0
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction | | 0
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning | | 0
Neural Network Approximation for Pessimistic Offline Reinforcement Learning | | 0
Off-dynamics Conditional Diffusion Planners | | 0
OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control | | 0
Offline Actor-Critic Reinforcement Learning Scales to Large Models | | 0
Offline Evaluation for Reinforcement Learning-based Recommendation: A Critical Issue and Some Alternatives | | 0
Offline Fictitious Self-Play for Competitive Games | | 0
Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies | | 0
Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare | | 0
Offline Inverse Reinforcement Learning | | 0
Offline Model-Based Reinforcement Learning with Anti-Exploration | | 0
Offline Multi-Agent Reinforcement Learning with Coupled Value Factorization | | 0
Offline Multi-task Transfer RL with Representational Penalization | | 0
Offline Policy Evaluation and Optimization under Confounding | | 0
Offline Policy Evaluation for Reinforcement Learning with Adaptively Collected Data | | 0
Offline Policy Optimization in RL with Variance Regularization | | 0
Offline Policy Optimization with Variance Regularization | | 0
Page 14 of 16

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified