Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties, for its actions. The goal is to find an optimal policy: a decision-making strategy that maximizes the expected long-term reward.
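
The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from this page: the five-state chain environment, the `step`, `train`, `discounted_return`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical. Tabular Q-learning is used only as one simple example of how an agent can learn a policy from rewards; many other algorithms exist.

```python
import random

N_STATES = 5      # hypothetical chain of states 0..4; state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical environment: reward 1.0 only when the last state is reached."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def discounted_return(rewards, gamma=0.9):
    """Cumulative reward of one episode: sum over t of gamma**t * rewards[t]."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties randomly
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
    print(discounted_return([0.0, 0.0, 1.0], gamma=0.5))
```

In this toy setting the learned greedy policy moves right in every state, since that is the only way to reach the rewarded terminal state. The discount factor `gamma` controls how strongly long-term reward is weighted against immediate reward.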

Papers

Showing 3061–3070 of 15113 papers

Title	Date	Tasks	Status
Deep Multi-Agent Reinforcement Learning with Hybrid Action Spaces based on Maximum Entropy	Jun 10, 2022	Deep Reinforcement Learning, Multi-agent Reinforcement Learning	Unverified
Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications	Feb 15, 2023	Decision Making, Management	Unverified
A survey on intrinsic motivation in reinforcement learning	Aug 19, 2019	reinforcement-learning, Reinforcement Learning	Unverified
A Survey on Interpretable Reinforcement Learning	Dec 24, 2021	Autonomous Driving, Decision Making	Unverified
Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations	Jun 22, 2021	reinforcement-learning, Reinforcement Learning	Unverified
A Survey on GUI Agents with Foundation Models Enhanced by Reinforcement Learning	Apr 29, 2025	Action Generation, Prompt Engineering	Unverified
Adaptation of Quadruped Robot Locomotion with Meta-Learning	Jul 8, 2021	Meta-Learning, Meta Reinforcement Learning	Unverified
A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback	Nov 20, 2024	Decision Making, Reinforcement Learning (RL)	Unverified
A Survey on Dialog Management: Recent Advances and Challenges	May 5, 2020	Management, Reinforcement Learning (RL)	Unverified
Adaptable Recovery Behaviors in Robotics: A Behavior Trees and Motion Generators(BTMG) Approach for Failure Management	Apr 9, 2024	Management, Reinforcement Learning (RL)	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified