SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
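
The interaction loop described above can be sketched in a few lines. This is a minimal illustration, not a method taken from any paper listed here: it assumes a hypothetical five-state corridor environment (`step`), and it uses tabular Q-learning with epsilon-greedy exploration as one possible learning rule. The names `N_STATES`, `step`, `train`, `greedy_policy` and all hyperparameter values are illustrative assumptions.

```python
import random

N_STATES = 5      # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: learn action values from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The discount factor `gamma` is what turns individual rewards into a long-term objective: values learned near the goal propagate backward, so the agent comes to prefer moving right even in states where the immediate reward is zero.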

Papers

Showing 5051–5075 of 15113 papers

Title | Status | Hype
Reinforcement Learning to Rank with Coarse-grained Labels | | 0
Reinforcement Learning to Solve NP-hard Problems: an Application to the CVRP | | 0
Reinforcement Learning Tracking Control for Robotic Manipulator With Kernel-Based Dynamic Model | | 0
Reinforcement Learning: Tutorial and Survey | | 0
Reinforcement Learning Under Algorithmic Triage | | 0
Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory | | 0
Non-Stationary Reinforcement Learning: The Blessing of (More) Optimism | | 0
Reinforcement Learning under Model Mismatch | | 0
Reinforcement Learning under Partial Observability Guided by Learned Environment Models | | 0
Reinforcement Learning Under Probabilistic Spatio-Temporal Constraints with Time Windows | | 0
Reinforcement Learning–Based Transient Response Shaping for Microgrids | | 0
Reinforcement Learning using Augmented Neural Networks | | 0
Reinforcement learning using Deep Q Networks and Q learning accurately localizes brain tumors on MRI with very small training sets | | 0
Reinforcement Learning using Guided Observability | | 0
Reinforcement Learning using Kernel-Based Stochastic Factorization | | 0
Reinforcement Learning Using Quantum Boltzmann Machines | | 0
Reinforcement Learning via AIXI Approximation | | 0
Reinforcement Learning via Gaussian Processes with Neural Network Dual Kernels | | 0
Reinforcement Learning via Reasoning from Demonstration | | 0
Reinforcement Learning via Replica Stacking of Quantum Measurements for the Training of Quantum Boltzmann Machines | | 0
Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Robot Learning | | 0
Reinforcement Learning with Adaptive Curriculum Dynamics Randomization for Fault-Tolerant Robot Control | | 0
Reinforcement Learning with a Disentangled Universal Value Function for Item Recommendation | | 0
Reinforcement Learning with Almost Sure Constraints | | 0
Reinforcement Learning with Analogical Similarity to Guide Schema Induction and Attention | | 0
Page 203 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified