
Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
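The agent-environment loop described above can be sketched with tabular Q-learning on a toy problem. This is an illustrative sketch only: the five-state chain environment, the `step` and `train` functions, and all hyperparameters are hypothetical choices made for this example, not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the rewarding terminal state (assumed toy setup)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values from interaction; returns the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this setup the only reward lies at the right end of the chain, so the learned greedy policy should favor moving right in every non-terminal state. The discount factor `gamma` is what turns the single delayed reward into a long-term objective: states closer to the goal end up with higher estimated values.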

Papers


Showing 3071–3080 of 15113 papers

Title	Date	Tasks	Status	Hype
Insurance pricing on price comparison websites via reinforcement learning	Aug 14, 2023	reinforcement-learning, Reinforcement Learning	Unverified	0
Dialogue for Prompting: a Policy-Gradient-Based Discrete Prompt Generation for Few-shot Learning	Aug 14, 2023	Few-Shot Learning, Reinforcement Learning (RL)	Code Available	1
IOB: Integrating Optimization Transfer and Behavior Transfer for Multi-Policy Reuse	Aug 14, 2023	Continual Learning, Reinforcement Learning (RL)	Unverified	0
Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads	Aug 14, 2023	Reinforcement Learning (RL)	Unverified	0
Omega-Regular Reward Machines	Aug 14, 2023	Reinforcement Learning (RL)	Unverified	0
Neural Categorical Priors for Physics-Based Character Control	Aug 14, 2023	Diversity, Reinforcement Learning (RL)	Unverified	0
InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models	Aug 13, 2023	CPU, GPU	Unverified	0
CyberForce: A Federated Reinforcement Learning Framework for Malware Mitigation	Aug 11, 2023	Anomaly Detection, Data Poisoning	Unverified	0
A Comparison of Classical and Deep Reinforcement Learning Methods for HVAC Control	Aug 10, 2023	Deep Reinforcement Learning, Q-Learning	Unverified	0
Provably Efficient Algorithm for Nonstationary Low-Rank MDPs	Aug 10, 2023	Reinforcement Learning (RL)	Unverified	0



Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified