
Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, delivered as rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
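As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning. Q-learning is only one possible algorithm, chosen here for brevity. The five-state corridor environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all assumptions made for this example and do not come from any listed paper.

```python
import random

# Hypothetical toy environment: a 5-state corridor. The agent starts in state 0
# and receives reward 1.0 only when it reaches the rightmost state.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), else act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move Q(s, a) toward the reward plus the discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [ACTIONS[0 if left > right else 1] for left, right in q[:-1]]


q_table = train()
policy = greedy_policy(q_table)
print(policy)  # [1, 1, 1, 1]: move right in every non-terminal state
```

The learned policy moves right everywhere because the discount factor `gamma` makes states closer to the goal more valuable, so the long-term reward is maximized by reaching the goal in as few steps as possible.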

Papers


Showing 1211–1220 of 15113 papers

Title	Date	Tasks	Status	Hype	Score
Accelerated Sim-to-Real Deep Reinforcement Learning: Learning Collision Avoidance from Human Player	Feb 21, 2021	Collision Avoidance, Deep Reinforcement Learning	Code Available	1	5
Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning	Aug 11, 2022	continuous-control, Continuous Control	Code Available	1	5
A Distributional Perspective on Reinforcement Learning	Jul 21, 2017	Atari Games, reinforcement-learning	Code Available	1	5
Building a 3-Player Mahjong AI using Deep Reinforcement Learning	Feb 25, 2022	Deep Reinforcement Learning, reinforcement-learning	Code Available	1	5
Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis	Oct 20, 2023	Code Generation, Language Modelling	Code Available	1	5
CaiRL: A High-Performance Reinforcement Learning Environment Toolkit	Oct 3, 2022	OpenAI Gym, reinforcement-learning	Code Available	1	5
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning	Mar 9, 2023	Offline RL, Q-Learning	Code Available	1	5
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models	Sep 13, 2024	Reinforcement Learning (RL)	Code Available	1	5
Automating DBSCAN via Deep Reinforcement Learning	Aug 9, 2022	Clustering, Computational Efficiency	Code Available	1	5
Emergent collective intelligence from massive-agent cooperation and competition	Jan 4, 2023	reinforcement-learning, Reinforcement Learning	Code Available	1	5


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified