SOTAVerified

Offline RL

Papers

Showing 251–300 of 755 papers

Title | Status | Hype
Energy-Weighted Flow Matching for Offline Reinforcement Learning | - | 0
Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction | Code | 0
Yes, Q-learning Helps Offline In-Context RL | - | 0
Enhancing Offline Model-Based RL via Active Model Selection: A Bayesian Optimization Perspective | - | 0
Which Features are Best for Successor Features? | - | 0
Diverse Transformer Decoding for Offline Reinforcement Learning Using Financial Algorithmic Approaches | - | 0
Active Advantage-Aligned Online Reinforcement Learning with Offline Data | Code | 0
Enhancing Pre-Trained Decision Transformers with Prompt-Tuning Bandits | - | 0
Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning | - | 0
OmniRL: In-Context Reinforcement Learning by Large-Scale Meta-Training in Randomized Worlds | - | 0
Policy-Guided Causal State Representation for Offline Reinforcement Learning Recommendation | - | 0
Resilient UAV Trajectory Planning via Few-Shot Meta-Offline Reinforcement Learning | - | 0
Flexible Blood Glucose Control: Offline Reinforcement Learning from Human Feedback | - | 0
Data Center Cooling System Optimization Using Offline Reinforcement Learning | - | 0
Fat-to-Thin Policy Optimization: Offline RL with Sparse Policies | Code | 0
Large Language Model driven Policy Exploration for Recommender Systems | - | 0
DRDT3: Diffusion-Refined Decision Test-Time Training Model | - | 0
SR-Reward: Taking The Path More Traveled | - | 0
On the Statistical Complexity for Offline and Low-Adaptive Reinforcement Learning with Structures | - | 0
Goal-Conditioned Data Augmentation for Offline Reinforcement Learning | - | 0
Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL | Code | 0
Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization | - | 0
AdaCred: Adaptive Causal Decision Transformers with Feature Crediting | - | 0
Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning | Code | 0
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone | - | 0
Reinforcement Learning: An Overview | - | 0
Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting | - | 0
Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback | - | 0
Robust Offline Reinforcement Learning with Linearly Structured f-Divergence Regularization | - | 0
LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble | - | 0
PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement | - | 0
Continual Task Learning through Adaptive Policy Self-Composition | Code | 0
Preserving Expert-Level Privacy in Offline Reinforcement Learning | - | 0
Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning | - | 0
Streetwise Agents: Empowering Offline RL Policies to Outsmart Exogenous Stochastic Disturbances in RTC | - | 0
OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control | - | 0
Real-World Offline Reinforcement Learning from Vision Language Model Feedback | - | 0
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning | - | 0
Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions | Code | 0
Offline Reinforcement Learning and Sequence Modeling for Downlink Link Adaptation | - | 0
NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation | Code | 0
Learning Versatile Skills with Curriculum Masking | Code | 0
Offline reinforcement learning for job-shop scheduling problems | - | 0
Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces | - | 0
Off-dynamics Conditional Diffusion Planners | - | 0
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Code | 0
Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task | - | 0
DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation | - | 0
Multi-Objective-Optimization Multi-AUV Assisted Data Collection Framework for IoUT Based on Offline Reinforcement Learning | - | 0
Integrating Reinforcement Learning and Large Language Models for Crop Production Process Management Optimization and Control through A New Knowledge-Based Deep Learning Paradigm | - | 0
Page 6 of 16

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | - | Unverified
2 | ADMPO | Average Reward | 81 | - | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | - | Unverified