SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
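
The interaction loop described above can be illustrated with a minimal sketch. Everything here is an assumption chosen for illustration and is not taken from any paper listed on this page: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, the hyperparameters, and the use of tabular Q-learning as the learning rule.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
LEFT, RIGHT = 0, 1    # the two available actions


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Explore with probability epsilon, or when the estimates are tied.
            tied = q[state][LEFT] == q[state][RIGHT]
            if tied or rng.random() < epsilon:
                action = rng.choice((LEFT, RIGHT))
            else:
                action = LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward now plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [RIGHT if q[s][RIGHT] > q[s][LEFT] else LEFT
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the only reward is given on reaching the goal, so the agent has to propagate that delayed signal back through the earlier states. The discount factor `gamma` is what makes shorter paths to the goal more valuable than longer ones.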

Papers

Showing 11376–11400 of 15113 papers

Title | Status | Hype
Hindsight Curriculum Generation Based Multi-Goal Experience Replay | | 0
Hindsight Expectation Maximization for Goal-conditioned Reinforcement Learning | | 0
Hindsight Generative Adversarial Imitation Learning | | 0
Hindsight Reward Tweaking via Conditional Deep Reinforcement Learning | | 0
Hindsight States: Blending Sim and Real Task Elements for Efficient Reinforcement Learning | | 0
Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective | | 0
HIPPOCAMPAL NEURONAL REPRESENTATIONS IN CONTINUAL LEARNING | | 0
Historical Text Normalization with Delayed Rewards | | 0
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation | | 0
Lagrangian-based online safe reinforcement learning for state-constrained systems | | 0
HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints | | 0
HLIC: Harmonizing Optimization Metrics in Learned Image Compression by Reinforcement Learning | | 0
Holistic Deep-Reinforcement-Learning-based Training of Autonomous Navigation Systems | | 0
HoME: a Household Multimodal Environment | | 0
Homotopy Based Reinforcement Learning with Maximum Entropy for Autonomous Air Combat | | 0
Hope For The Best But Prepare For The Worst: Cautious Adaptation In RL Agents | | 0
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare | | 0
Horizon: Facebook's Open Source Applied Reinforcement Learning Platform | | 0
Horizon-Free Regret for Linear Markov Decision Processes | | 0
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes | | 0
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies | | 0
Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs | | 0
Hovering Flight of Soft-Actuated Insect-Scale Micro Aerial Vehicles using Deep Reinforcement Learning | | 0
How an Electrical Engineer Became an Artificial Intelligence Researcher, a Multiphase Active Contours Analysis | | 0
How Can Creativity Occur in Multi-Agent Systems? | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified