SOTAVerified

Offline RL

Papers

Showing 601–625 of 755 papers

Title | Status | Hype
Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters | - | 0
Yes, Q-learning Helps Offline In-Context RL | - | 0
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments | - | 0
You Only Evaluate Once: a Simple Baseline Algorithm for Offline RL | - | 0
Your Offline Policy is Not Trustworthy: Bilevel Reinforcement Learning for Sequential Portfolio Optimization | - | 0
PerSim: Data-Efficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators | - | 0
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes | - | 0
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning | - | 0
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning | - | 0
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage | - | 0
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning | - | 0
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity | - | 0
π2vec: Policy Representations with Successor Features | - | 0
Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning | - | 0
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone | - | 0
Policy-Based Trajectory Clustering in Offline Reinforcement Learning | - | 0
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning | - | 0
Policy Gradients Incorporating the Future | - | 0
Policy-Guided Causal State Representation for Offline Reinforcement Learning Recommendation | - | 0
Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning | - | 0
Preference Elicitation for Offline Reinforcement Learning | - | 0
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning | - | 0
Preserving Expert-Level Privacy in Offline Reinforcement Learning | - | 0
Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning | - | 0
Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | - | Unverified
2 | ADMPO | Average Reward | 81 | - | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | - | Unverified