SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
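The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch: the five-cell corridor environment, the reward values, and the names `step`, `train`, and `greedy_policy` are all hypothetical and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells. The agent starts in
# cell 0; reaching cell 4 ends the episode with reward +1, and every other
# move costs -0.01. These numbers are assumptions made for illustration.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values from reward feedback with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Target = immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action (as a move of -1 or +1) for each non-terminal cell."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns the learned action for each non-terminal cell. Under these assumed settings the agent should learn to move right in every cell, since that maximizes the discounted cumulative reward.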

Papers

Showing 6126-6150 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning to Play Pong using Policy Gradient Learning | | 0 |
| Learning to Play Soccer by Reinforcement and Applying Sim-to-Real to Compete in the Real World | | 0 |
| Learning to Play Table Tennis From Scratch using Muscular Robots | | 0 |
| Learning to Play Two-Player Perfect-Information Games without Knowledge | | 0 |
| Learning to predict where to look in interactive environments using deep recurrent q-learning | | 0 |
| Learning to Program Variational Quantum Circuits with Fast Weights | | 0 |
| Learning to Progressively Plan | | 0 |
| Learning to Provably Satisfy High Relative Degree Constraints for Black-Box Systems | | 0 |
| Learning to Prune Deep Neural Networks via Reinforcement Learning | | 0 |
| Learning to Query Internet Text for Informing Reinforcement Learning Agents | | 0 |
| Learning to Reach Goals Without Reinforcement Learning | | 0 |
| Learning to Reason: Distilling Hierarchy via Self-Supervision and Reinforcement Learning | | 0 |
| Learning to Reason in Large Theories without Imitation | | 0 |
| Learning to Recover Sparse Signals | | 0 |
| Learning to Reinforcement Learn by Imitation | | 0 |
| Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning | | 0 |
| Learning to Represent Action Values as a Hypergraph on the Action Vertices | | 0 |
| Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games | | 0 |
| Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning | | 0 |
| Learning Torque Control for Quadrupedal Locomotion | | 0 |
| Learning to Run challenge: Synthesizing physiologically accurate motion using deep reinforcement learning | | 0 |
| Learning to Run with Potential-Based Reward Shaping and Demonstrations from Video Data | | 0 |
| Learning to Sail Dynamic Networks: The MARLIN Reinforcement Learning Framework for Congestion Control in Tactical Environments | | 0 |
| Learning to sample in Cartesian MRI | | 0 |
| Learning to Sample with Local and Global Contexts in Experience Replay Buffer | | 0 |
Page 246 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |