Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
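
As a rough illustration of the interaction loop described above, the sketch below runs tabular Q-learning on a hypothetical five-state chain. The environment, the names `step`, `train`, `greedy_policy`, and `discounted_return`, and all hyperparameters are assumptions made for this example; they are not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic environment: reward 1.0 only on reaching the last state."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward with discount factor gamma, computed back to front."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0, max_steps=200):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action per non-terminal state under the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print("greedy policy:", greedy_policy(q_table))
    print("return of [0, 0, 0, 1] at gamma=0.9:", discounted_return([0, 0, 0, 1], 0.9))
```

In this toy setup the only reward sits at the right end of the chain, so the learned greedy policy should prefer moving right in every state. Real benchmarks differ mainly in the size of the state space and in replacing the table with a function approximator.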

Papers

Showing 3851–3860 of 15113 papers

Title	Date	Tasks	Status
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning	Jun 7, 2024	Deep Reinforcement Learning, Reinforcement Learning (RL)	Unverified
Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking	Jun 6, 2024	reinforcement-learning, Reinforcement Learning	Unverified
Proofread: Fixes All Errors with One Tap	Jun 6, 2024	All, Quantization	Unverified
Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models	Jun 6, 2024	Offline RL, reinforcement-learning	Unverified
Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning	Jun 6, 2024	reinforcement-learning, Reinforcement Learning	Code Available
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories	Jun 6, 2024	Data Augmentation, reinforcement-learning	Unverified
Towards Dynamic Trend Filtering through Trend Point Detection with Reinforcement Learning	Jun 6, 2024	Reinforcement Learning (RL), Time Series	Code Available
Breeding Programs Optimization with Reinforcement Learning	Jun 6, 2024	reinforcement-learning, Reinforcement Learning	Unverified
Bootstrapping Expectiles in Reinforcement Learning	Jun 6, 2024	Q-Learning, reinforcement-learning	Unverified
Optimizing Autonomous Driving for Safety: A Human-Centric Approach with LLM-Enhanced RLHF	Jun 6, 2024	Autonomous Driving, reinforcement-learning	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified