SOTAVerified

Offline RL

Papers

Showing 251–275 of 755 papers

| Title | Status | Hype |
| FOSP: Fine-tuning Offline Safe Policy through World Models | | 0 |
| Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning | | 0 |
| From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | | 0 |
| Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation | | 0 |
| End-to-end Offline Reinforcement Learning for Glycemia Control | | 0 |
| End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient | | 0 |
| ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization | | 0 |
| Enabling A Network AI Gym for Autonomous Cyber Agents | | 0 |
| Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL | | 0 |
| Augmenting Offline RL with Unlabeled Data | | 0 |
| EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL | | 0 |
| CLUE: Calibrated Latent Guidance for Offline Reinforcement Learning | | 0 |
| Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only | | 0 |
| A Fast Convergence Theory for Offline Decision Making | | 0 |
| A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning | | 0 |
| InferNet for Delayed Reinforcement Tasks: Addressing the Temporal Credit Assignment Problem | | 0 |
| ChiPFormer: Transferable Chip Placement via Offline Decision Transformer | | 0 |
| Efficient Imitation Learning with Conservative World Models | | 0 |
| Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings | | 0 |
| Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning | | 0 |
| Dual Generator Offline Reinforcement Learning | | 0 |
| A Survey on Model-based Reinforcement Learning | | 0 |
| Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning | | 0 |
| Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions | | 0 |
| DRDT3: Diffusion-Refined Decision Test-Time Training Model | | 0 |
Page 11 of 31

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |