SOTAVerified

Offline RL

Papers

Showing 726–750 of 755 papers

Title | Status | Hype
Offline RL Policies Should be Trained to be Adaptive | | 0
Offline RL via Feature-Occupancy Gradient Ascent | | 0
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity | | 0
Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints | | 0
Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator | | 0
Offline Trajectory Generalization for Offline Reinforcement Learning | | 0
OffRIPP: Offline RL-based Informative Path Planning | | 0
OmniRL: In-Context Reinforcement Learning by Large-Scale Meta-Training in Randomized Worlds | | 0
Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks | | 0
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation | | 0
On Multi-objective Policy Optimization as a Tool for Reinforcement Learning: Case Studies in Offline RL and Finetuning | | 0
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond | | 0
On the Role of Discount Factor in Offline Reinforcement Learning | | 0
On the Sample Complexity of Vanilla Model-Based Offline Reinforcement Learning with Dependent Samples | | 0
On the Statistical Complexity for Offline and Low-Adaptive Reinforcement Learning with Structures | | 0
Offline Preference-Based Apprenticeship Learning | | 0
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning | | 0
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | | 0
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian | | 0
Binary Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning | | 0
Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL | | 0
Optimistic Model Rollouts for Pessimistic Offline Policy Optimization | | 0
Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning | | 0
Optimizing Trajectories for Highway Driving with Offline Reinforcement Learning | | 0
Oracle Inequalities for Model Selection in Offline Reinforcement Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | KFC | Average Reward | 81.8 | | Unverified
2 | ADMPO | Average Reward | 81 | | Unverified
3 | Decision Transformer (DT) | Average Reward | 73.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ParPI | D4RL Normalized Score | 151.4 | | Unverified