SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
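As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one standard RL algorithm. The five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical choices made for this example and are not taken from any paper listed on this page.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (assumed toy setup)
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):        # cap episode length
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should step right in every cell, since that is the shortest path to the only reward. The discount factor `gamma` is what makes nearer rewards worth more, which is how "long-term reward" becomes a concrete quantity to maximize.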

Papers

Showing 6676–6700 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| DRL-based Slice Placement Under Non-Stationary Conditions | | 0 |
| DRL-based Slice Placement under Realistic Network Load Conditions | | 0 |
| DRL-Clusters: Buffer Management with Clustering based Deep Reinforcement Learning | | 0 |
| Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation | | 0 |
| DRL: Deep Reinforcement Learning for Intelligent Robot Control -- Concept, Literature, and Future | | 0 |
| DRL-FAS: A Novel Framework Based on Deep Reinforcement Learning for Face Anti-Spoofing | | 0 |
| DRL-ISP: Multi-Objective Camera ISP with Deep Reinforcement Learning | | 0 |
| DR-MPC: Deep Residual Model Predictive Control for Real-world Social Navigation | | 0 |
| DROP: Distributional and Regular Optimism and Pessimism for Reinforcement Learning | | 0 |
| DSADF: Thinking Fast and Slow for Decision Making | | 0 |
| DSDF: An approach to handle stochastic agents in collaborative multi-agent reinforcement learning | | 0 |
| DSDF: Coordinated look-ahead strategy in stochastic multi-agent reinforcement learning | | 0 |
| D-Shape: Demonstration-Shaped Reinforcement Learning via Goal Conditioning | | 0 |
| DSP: A Differential Spatial Prediction Scheme for Comprehensive real industrial datasets | | 0 |
| Dual Active Learning for Reinforcement Learning from Human Feedback | | 0 |
| Dual-Agent Deep Reinforcement Learning for Deformable Face Tracking | | 0 |
| Dual Behavior Regularized Reinforcement Learning | | 0 |
| Dual Control for Approximate Bayesian Reinforcement Learning | | 0 |
| Dual Ensemble Kalman Filter for Stochastic Optimal Control | | 0 |
| Dual Generator Offline Reinforcement Learning | | 0 |
| Dual-Objective Reinforcement Learning with Novel Hamilton-Jacobi-Bellman Formulations | | 0 |
| Dueling Deep Q Network for Highway Decision Making in Autonomous Vehicles: A Case Study | | 0 |
| Dueling RL: Reinforcement Learning with Trajectory Preferences | | 0 |
| DyFEn: Agent-Based Fee Setting in Payment Channel Networks | | 0 |
| Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |