SOTAVerified

Offline RL

Papers

Showing 271–280 of 755 papers

Title | Status | Hype
Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL | Code | 0
Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization | | 0
AdaCred: Adaptive Causal Decision Transformers with Feature Crediting | | 0
Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning | Code | 0
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone | | 0
Reinforcement Learning: An Overview | | 0
Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting | | 0
Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback | | 0
Robust Offline Reinforcement Learning with Linearly Structured f-Divergence Regularization | | 0
LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified