Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes expected long-term reward.
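
The loop described above (act, observe a reward, update the policy) can be illustrated with a minimal tabular Q-learning sketch. Everything in it is an assumption made for illustration and does not come from any paper listed on this page: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helper names, and the hyperparameter values are all hypothetical.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward of 1.0 only when the goal is reached, otherwise 0.0.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-goal state under the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)  # each non-goal state should prefer action 1 (move right)
```

The discount factor `gamma` is what makes the objective "long-term": a reward several steps away still raises the value of the actions that lead toward it, just by a smaller amount.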

Papers

Showing 2671–2680 of 15113 papers

Title	Date	Tasks	Status
Learning to Rewrite Prompts for Personalized Text Generation	Sep 29, 2023	Language Modelling, Large Language Model	Unverified
Automatic Poetry Generation with Mutual Reinforcement Learning	Oct 1, 2018	reinforcement-learning, Reinforcement Learning	Unverified
Adaptive Intelligent Secondary Control of Microgrids Using a Biologically-Inspired Reinforcement Learning	May 2, 2019	reinforcement-learning, Reinforcement Learning	Unverified
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning	Dec 22, 2024	D4RL, Q-Learning	Unverified
Automatic, Personalized, and Flexible Playlist Generation using Reinforcement Learning	Sep 12, 2018	Diversity, Language Modeling	Unverified
A Local Temporal Difference Code for Distributional Reinforcement Learning	Dec 1, 2020	Distributional Reinforcement Learning, Imputation	Unverified
Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar	May 24, 2019	AutoML, Bayesian Optimization	Unverified
Automatic low-bit hybrid quantization of neural networks through meta learning	Apr 24, 2020	Meta-Learning, Quantization	Unverified
Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition	Dec 1, 2020	reinforcement-learning, Reinforcement Learning	Unverified
Adaptive Insurance Reserving with CVaR-Constrained Reinforcement Learning under Macroeconomic Regimes	Apr 13, 2025	Reinforcement Learning (RL)	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified