SOTAVerified|Agents Browse Leaderboard About Blog

Offline RL

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 21–30 of 755 papers

Title	Date	Tasks	Status	Hype	Score
Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning	May 27, 2024	Gym halfcheetah-mediumGym halfcheetah-medium-expert	CodeCode Available	2	5
Dungeons and Data: A Large-Scale NetHack Dataset	Nov 1, 2022	Decision MakingNetHack	CodeCode Available	2	5
Diffusion Guidance Is a Controllable Policy Improvement Operator	May 29, 2025	Offline RL	CodeCode Available	2	5
FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation	May 22, 2023	Imitation LearningMotion Planning	CodeCode Available	2	5
Flowformer: Linearizing Transformers with Conservation Flows	Feb 13, 2022	D4RLOffline RL	CodeCode Available	2	5
LongReward: Improving Long-context Large Language Models with AI Feedback	Oct 28, 2024	Offline RLReinforcement Learning (RL)	CodeCode Available	2	5
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations	Jun 9, 2022	Benchmarkingcontinuous-control	CodeCode Available	2	5
Pretrained LLM Adapted with LoRA as a Decision Transformer for Offline RL in Quantitative Trading	Nov 26, 2024	Offline RLparameter-efficient fine-tuning	CodeCode Available	2	5
CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning	Apr 18, 2022	ChatbotOffline RL	CodeCode Available	2	5
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning	Jun 17, 2022	Few-Shot LearningOffline RL	CodeCode Available	2	5

Show:10 25 50

← PrevPage 3 of 76Next →

All datasets D4RL Walker2d

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	KFC	Average Reward	81.8	—	Unverified
2	ADMPO	Average Reward	81	—	Unverified
3	Decision Transformer (DT)	Average Reward	73.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ParPI	D4RL Normalized Score	151.4	—	Unverified