SOTAVerified

Offline RL

Papers

Showing 676–700 of 755 papers

Title | Status | Hype
Policy-regularized Offline Multi-objective Reinforcement Learning | Code | 0
POPO: Pessimistic Offline Policy Optimization | Code | 0
d3rlpy: An Offline Deep Reinforcement Learning Library | Code | 0
Preference-Guided Reflective Sampling for Aligning Language Models | Code | 0
MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning | Code | 0
Using Offline Data to Speed Up Reinforcement Learning in Procedurally Generated Environments | Code | 0
Offline Equilibrium Finding | Code | 0
A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning | Code | 0
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees | Code | 0
NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation | Code | 0
The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning | Code | 0
Semi-Markov Offline Reinforcement Learning for Healthcare | Code | 0
Semi-Offline Reinforcement Learning for Optimized Text Generation | Code | 0
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Code | 0
DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Code | 0
Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning | Code | 0
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Code | 0
The Role of Deep Learning Regularizations on Actors in Offline RL | Code | 0
Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions | Code | 0
Uncertainty-driven Trajectory Truncation for Data Augmentation in Offline Reinforcement Learning | Code | 0
Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning | Code | 0
PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects | Code | 0
Mutual Information Regularized Offline Reinforcement Learning | Code | 0
Think-J: Learning to Think for Generative LLM-as-a-Judge | Code | 0
Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage | Code | 0
Page 28 of 31

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | - | Unverified
2 | ADMPO | Average Reward | 81 | - | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | - | Unverified