SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
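The interaction loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, the hyperparameters, and the use of a discount factor are all hypothetical, and tabular Q-learning is chosen here simply as one well-known way to learn a policy from rewards, not as the method of any paper listed on this page.

```python
import random

# Hypothetical toy environment: a chain of states 0..4; state 4 is terminal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Apply an action; reward 1.0 only when the right end is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the only reward sits at the right end of the chain, so the learned greedy policy chooses "move right" in every non-terminal state. The discount factor `gamma` is what turns the single delayed reward into a long-term value that earlier states can act on.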

Papers

Showing 8301–8325 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| ^2-exploration for Reinforcement Learning | | 0 |
| Superior Performance with Diversified Strategic Control in FPS Games Using General Reinforcement Learning | | 0 |
| Mitigation of Adversarial Policy Imitation via Constrained Randomization of Policy (CRoP) | | 0 |
| Polyphonic Music Composition: An Adversarial Inverse Reinforcement Learning Approach | | 0 |
| Uncertainty Regularized Policy Learning for Offline Reinforcement Learning | | 0 |
| Understanding and Leveraging Overparameterization in Recursive Value Estimation | | 0 |
| Understanding the Generalization Gap in Visual Reinforcement Learning | | 0 |
| Reasoning With Hierarchical Symbols: Reclaiming Symbolic Policies For Visual Reinforcement Learning | | 0 |
| Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning | | 0 |
| Metrics Matter: A Closer Look on Self-Paced Reinforcement Learning | | 0 |
| MURO: Deployment Constrained Reinforcement Learning with Model-based Uncertainty Regularized Batch Optimization | | 0 |
| Untangling Braids with Multi-agent Q-Learning | | 0 |
| Pretraining for Language Conditioned Imitation with Transformers | | 0 |
| Online Robust Reinforcement Learning with Model Uncertainty | | 0 |
| Meta Attention For Off-Policy Actor-Critic | | 0 |
| SAFER: Data-Efficient and Safe Reinforcement Learning Through Skill Acquisition | | 0 |
| Value Refinement Network (VRN) | | 0 |
| PDQN - A Deep Reinforcement Learning Method for Planning with Long Delays: Optimization of Manufacturing Dispatching | | 0 |
| Variational oracle guiding for reinforcement learning | | 0 |
| Online Tuning for Offline Decentralized Multi-Agent Reinforcement Learning | | 0 |
| Reinforcement Learning State Estimation for High-Dimensional Nonlinear Systems | | 0 |
| Safe Exploration in Linear Equality Constraint | | 0 |
| State-Action Joint Regularized Implicit Policy for Offline Reinforcement Learning | | 0 |
| Reinforcement Learning with Predictive Consistent Representations | | 0 |
| Maximizing Ensemble Diversity in Deep Reinforcement Learning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |