
Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
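The interaction loop described above can be illustrated with a minimal sketch of tabular Q-learning, one common RL algorithm. Everything here is an assumption made for illustration and does not come from any listed paper: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical.

```python
import random

# Hypothetical toy environment: a 5-state corridor. The agent starts in
# state 0 and receives reward 1.0 only on reaching the rightmost state.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when values tie), else exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this sketch the reward signal is sparse (only the final transition pays out), so the discount factor `gamma` is what lets value propagate back to earlier states; the learned greedy policy is expected to prefer moving right in every non-terminal state.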

Papers


Showing 1961–1970 of 15113 papers

Title	Date	Tasks	Status	Hype
Implementation Matters in Deep RL: A Case Study on PPO and TRPO	May 1, 2020	Deep Reinforcement Learning, reinforcement-learning	Code Available	1
RaCT: Toward Amortized Ranking-Critical Training For Collaborative Filtering	May 1, 2020	Collaborative Filtering, Learning-To-Rank	Code Available	1
Deep Symbolic Superoptimization Without Human Knowledge	May 1, 2020	Decoder, reinforcement-learning	Code Available	1
Logic and the 2-Simplicial Transformer	May 1, 2020	Deep Reinforcement Learning, Inductive Bias	Code Available	1
Learning Collaborative Agents with Rule Guidance for Knowledge Graph Reasoning	May 1, 2020	reinforcement-learning, Reinforcement Learning (RL)	Code Available	1
Reinforcement Learning with Augmented Data	Apr 30, 2020	Data Augmentation, OpenAI Gym	Code Available	1
Actor-Critic Reinforcement Learning for Control with Stability Guarantee	Apr 29, 2020	Motion Planning, reinforcement-learning	Code Available	1
Hierarchical Reinforcement Learning for Automatic Disease Diagnosis	Apr 29, 2020	Hierarchical Reinforcement Learning, reinforcement-learning	Code Available	1
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels	Apr 28, 2020	All, Atari Games 100k	Code Available	1
Transferable Active Grasping and Real Embodied Dataset	Apr 28, 2020	Reinforcement Learning, Reinforcement Learning (RL)	Code Available	1



Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified