SOTAVerified

Offline RL

Papers

Showing 351-360 of 755 papers

| Title | Status | Hype |
|---|---|---|
| Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Code | 0 |
| Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL | Code | 0 |
| Stabilizing Extreme Q-learning by Maclaurin Expansion | Code | 0 |
| Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models | | 0 |
| UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning | | 0 |
| A Fast Convergence Theory for Offline Decision Making | | 0 |
| Causal prompting model-based offline reinforcement learning | | 0 |
| Inverse Concave-Utility Reinforcement Learning is Inverse Game Theory | | 0 |
| Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning | | 0 |
| AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | Code | 0 |

Benchmark Results

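| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |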