SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find the optimal policy, that is, the decision-making strategy that maximizes the long-term reward.
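As a rough illustration of the description above, the following minimal sketch runs an agent-environment loop and scores two fixed policies by their discounted cumulative reward. Everything in it is an assumption made for illustration: the five-state corridor environment, the function names (`step`, `run_episode`, `discounted_return`), the reward of 1.0 at the goal, and the discount factor of 0.5. It does not correspond to any paper listed on this page, and it only evaluates policies rather than learning one.

```python
from typing import Callable, List, Tuple

GOAL = 4  # hypothetical corridor with states 0..4; state 4 is terminal


def step(state: int, action: int) -> Tuple[int, float, bool]:
    """Apply an action (-1 = left, +1 = right); return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)  # walls at both ends
    done = next_state == GOAL
    reward = 1.0 if done else 0.0  # assumed reward: 1 only on reaching the goal
    return next_state, reward, done


def run_episode(policy: Callable[[int], int], max_steps: int = 20) -> List[float]:
    """Let the policy act from state 0 and collect the rewards it receives."""
    state = 0
    rewards: List[float] = []
    for _ in range(max_steps):
        state, reward, done = step(state, policy(state))
        rewards.append(reward)
        if done:
            break
    return rewards


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Cumulative reward with discount: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def always_right(state: int) -> int:
    return 1


def always_left(state: int) -> int:
    return -1


if __name__ == "__main__":
    for name, policy in [("always_right", always_right), ("always_left", always_left)]:
        rewards = run_episode(policy)
        print(name, discounted_return(rewards, gamma=0.5))
```

In this toy setting the policy that always moves right reaches the goal and earns a positive return, while the policy that always moves left never does. An RL algorithm would search for the higher-return policy from experience instead of being handed it.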

Papers

Showing 8401–8425 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Dual Behavior Regularized Reinforcement Learning | | 0 |
| Lifelong Robotic Reinforcement Learning by Retaining Experiences | | 0 |
| Greedy UnMixing for Q-Learning in Multi-Agent Reinforcement Learning | | 0 |
| Hindsight Foresight Relabeling for Meta-Reinforcement Learning | Code | 0 |
| Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation | | 0 |
| Deep Reinforcement Learning Based Multidimensional Resource Management for Energy Harvesting Cognitive NOMA Communications | | 0 |
| Coordinated Random Access for Industrial IoT With Correlated Traffic By Reinforcement-Learning | | 0 |
| Decentralized Global Connectivity Maintenance for Multi-Robot Navigation: A Reinforcement Learning Approach | | 0 |
| Carl-Lead: Lidar-based End-to-End Autonomous Driving with Contrastive Deep Reinforcement Learning | | 0 |
| Exploring the Training Robustness of Distributional Reinforcement Learning against Noisy State Observations | Code | 0 |
| POAR: Efficient Policy Optimization via Online Abstract State Representation Learning | | 0 |
| Soft Actor-Critic With Integer Actions | | 0 |
| RAPID-RL: A Reconfigurable Architecture with Preemptive-Exits for Efficient Deep-Reinforcement Learning | | 0 |
| Reinforcement Learning on Encrypted Data | | 0 |
| Enabling risk-aware Reinforcement Learning for medical interventions through uncertainty decomposition | | 0 |
| Learning from Peers: Deep Transfer Reinforcement Learning for Joint Radio and Cache Resource Allocation in 5G RAN Slicing | | 0 |
| Conservative Data Sharing for Multi-Task Offline Reinforcement Learning | | 0 |
| Comparison and Unification of Three Regularization Methods in Batch Reinforcement Learning | | 0 |
| Balancing detectability and performance of attacks on the control channel of Markov Decision Processes | Code | 0 |
| Back to Basics: Deep Reinforcement Learning in Traffic Signal Control | Code | 0 |
| Estimation of Warfarin Dosage with Reinforcement Learning | Code | 0 |
| DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning | Code | 0 |
| Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback | | 0 |
| What Does The User Want? Information Gain for Hierarchical Dialogue Policy Optimisation | | 0 |
| Short Quantum Circuits in Reinforcement Learning Policies for the Vehicle Routing Problem | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |