SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
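
The agent-environment loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any paper listed on this page: a hypothetical five-state chain environment (`step`), tabular Q-learning as the learning rule, uniform random exploration, and the constants `GAMMA` and `ALPHA`.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (toy assumption)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Hypothetical toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, max_steps=100, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)            # explore at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

In this toy setting the learned greedy policy should move right in every non-terminal state, since that is the only way to reach the rewarded goal. Q-learning is off-policy, which is why the sketch can explore at random and still estimate the values of the greedy policy.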

Papers

Showing 9501-9525 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Regret Bounds for Discounted MDPs | | 0 |
| Regret Bounds for Information-Directed Reinforcement Learning | | 0 |
| Regret Bounds for Learning State Representations in Reinforcement Learning | | 0 |
| Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents | | 0 |
| Regret Bounds for Reinforcement Learning via Markov Chain Concentration | | 0 |
| Regret Bounds for Reinforcement Learning with Policy Advice | | 0 |
| Regret Bounds for Risk-Sensitive Reinforcement Learning | | 0 |
| Regret-Free Reinforcement Learning for LTL Specifications | | 0 |
| Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function | | 0 |
| Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning | | 0 |
| Regularization Guarantees Generalization in Bayesian Reinforcement Learning through Algorithmic Stability | | 0 |
| Compositional Transfer in Hierarchical Reinforcement Learning | | 0 |
| Regularized Inverse Reinforcement Learning | | 0 |
| Regularize! Don't Mix: Multi-Agent Reinforcement Learning without Explicit Centralized Structures | | 0 |
| Regularized Parameter Uncertainty for Improving Generalization in Reinforcement Learning | | 0 |
| Regularized Policies are Reward Robust | | 0 |
| Regularized Policy Iteration | | 0 |
| Regularized Q-learning | | 0 |
| Regularizing Action Policies for Smooth Control with Reinforcement Learning | | 0 |
| Regularizing Trajectory Optimization with Denoising Autoencoders | | 0 |
| Regulating Reward Training by Means of Certainty Prediction in a Neural Network-Implemented Pong Game | | 0 |
| REIN-2: Giving Birth to Prepared Reinforcement Learning Agents Using Reinforcement Learning Agents | | 0 |
| ReinDSplit: Reinforced Dynamic Split Learning for Pest Recognition in Precision Agriculture | | 0 |
| ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning | | 0 |
| Reinforce Attack: Adversarial Attack against BERT with Reinforcement Learning | | 0 |
Page 381 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |