SOTAVerified

Offline RL

Papers

Showing 26-50 of 755 papers

| Title | Status | Hype |
| --- | --- | --- |
| Diffusion Self-Weighted Guidance for Offline Reinforcement Learning | | 0 |
| PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects | Code | 0 |
| Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only | | 0 |
| Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies | | 0 |
| Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning | | 0 |
| Think-J: Learning to Think for Generative LLM-as-a-Judge | Code | 0 |
| Your Offline Policy is Not Trustworthy: Bilevel Reinforcement Learning for Sequential Portfolio Optimization | | 0 |
| Prior-Guided Diffusion Planning for Offline Reinforcement Learning | | 0 |
| ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts | Code | 1 |
| Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data | | 0 |
| Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL | | 0 |
| What Matters for Batch Online Reinforcement Learning in Robotics? | | 0 |
| Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | | 0 |
| Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach | | 0 |
| Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning | | 0 |
| Taming OOD Actions for Offline Reinforcement Learning: An Advantage-Based Approach | | 0 |
| Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study | | 0 |
| Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning | | 0 |
| Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator | | 0 |
| VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning | | 0 |
| A Clean Slate for Offline Reinforcement Learning | Code | 3 |
| Towards Optimal Differentially Private Regret Bounds in Linear MDPs | | 0 |
| Decision SpikeFormer: Spike-Driven Transformer for Decision Making | | 0 |
| Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation | | 0 |
| Offline Reinforcement Learning with Discrete Diffusion Skills | | 0 |

Benchmark Results

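| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | KFC | Average Reward | 81.8 | | Unverified |
| 2 | ADMPO | Average Reward | 81 | | Unverified |
| 3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified |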