Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback its actions produce, delivered as rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
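As a rough illustration of the interaction loop described above, the sketch below runs tabular Q-learning on a tiny corridor environment. The environment, the function names (`step`, `discounted_return`, `train`, `greedy_policy`), and all hyperparameters are assumptions made up for this example; they do not come from any paper or benchmark listed on this page, and Q-learning is only one of many ways to learn a policy.

```python
import random

# Hypothetical toy environment (an assumption for illustration): a 1-D
# corridor of N_STATES cells. The agent starts at cell 0; entering the last
# cell ends the episode with reward +1. Every other step gives reward 0.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for one environment transition."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward G = sum over t of gamma**t * rewards[t]."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when the two action
            # values are tied); otherwise take the greedy action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Move Q(s, a) toward the bootstrapped one-step target.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state (ties resolve to 'right')."""
    return [0 if q_s[0] > q_s[1] else 1 for q_s in q[:-1]]
```

Calling `greedy_policy(train())` yields the learned action for each non-terminal cell; in this corridor the reward lies to the right, so the policy should settle on moving right everywhere. `discounted_return` shows how a reward sequence is collapsed into the single long-term quantity the agent is trying to maximize.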

Papers

Showing 1341–1350 of 15113 papers

Title	Date	Tasks	Status
A Review of Reinforcement Learning in Financial Applications	Nov 1, 2024	Benchmarking, Decision Making	Unverified
Towards Building Secure UAV Navigation with FHE-aware Knowledge Distillation	Nov 1, 2024	Knowledge Distillation, Reinforcement Learning (RL)	Unverified
AI-based traffic analysis in digital twin networks	Nov 1, 2024	Fairness, Federated Learning	Unverified
Statistical Guarantees for Lifelong Reinforcement Learning using PAC-Bayes Theory	Nov 1, 2024	reinforcement-learning, Reinforcement Learning	Unverified
Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions	Nov 1, 2024	Bayesian Inference, Offline RL	Code Available
Effective ML Model Versioning in Edge Networks	Nov 1, 2024	model, reinforcement-learning	Unverified
EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization	Oct 31, 2024	Bayesian Optimization, Decision Making	Unverified
Scalable Reinforcement Post-Training Beyond Static Human Prompts: Evolving Alignment via Asymmetric Self-Play	Oct 31, 2024	Reinforcement Learning (RL)	Unverified
Maximum Entropy Hindsight Experience Replay	Oct 31, 2024	reinforcement-learning, Reinforcement Learning	Unverified
Deterministic Exploration via Stationary Bellman Error Maximization	Oct 31, 2024	Reinforcement Learning (RL)	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified