SOTAVerified

Offline RL

Papers

Showing 351-375 of 755 papers

Title | Status | Hype
Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Code | 0
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL | Code | 0
Stabilizing Extreme Q-learning by Maclaurin Expansion | Code | 0
Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models | | 0
UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning | | 0
A Fast Convergence Theory for Offline Decision Making | | 0
Causal prompting model-based offline reinforcement learning | | 0
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning | | 0
Inverse Concave-Utility Reinforcement Learning is Inverse Game Theory | | 0
AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | Code | 0
Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier | | 0
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear q^π-Realizability and Concentrability | | 0
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | | 0
Exclusively Penalized Q-learning for Offline Reinforcement Learning | | 0
Offline Reinforcement Learning from Datasets with Structured Non-Stationarity | Code | 0
Offline RL via Feature-Occupancy Gradient Ascent | | 0
Efficient Imitation Learning with Conservative World Models | | 0
Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | Code | 0
Towards Robust Policy: Enhancing Offline Reinforcement Learning with Adversarial Attacks and Defenses | | 0
Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning | | 0
Improving Offline Reinforcement Learning with Inaccurate Simulators | | 0
Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning | | 0
Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows | | 0
Generalize by Touching: Tactile Ensemble Skill Transfer for Robotic Furniture Assembly | | 0
Offline Reinforcement Learning with Behavioral Supervisor Tuning | | 0
Page 15 of 31

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified