SOTAVerified

Offline RL

Papers

Showing 221–230 of 755 papers

| Title | Status | Hype |
|---|---|---|
| Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL | Code | 0 |
| GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning | | 0 |
| Diffusion Self-Weighted Guidance for Offline Reinforcement Learning | | 0 |
| Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only | | 0 |
| PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects | Code | 0 |
| Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies | | 0 |
| Think-J: Learning to Think for Generative LLM-as-a-Judge | Code | 0 |
| Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning | | 0 |
| Your Offline Policy is Not Trustworthy: Bilevel Reinforcement Learning for Sequential Portfolio Optimization | | 0 |
| Prior-Guided Diffusion Planning for Offline Reinforcement Learning | | 0 |

Benchmark Results

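| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |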