
Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
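The interaction loop described above can be illustrated with a minimal sketch. It is not taken from any paper listed on this page: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical choices made for illustration, and tabular Q-learning is used only as one simple, well-known example of an RL algorithm.

```python
import random

# Hypothetical toy environment: a corridor of N_STATES cells.
# The agent starts in cell 0; entering the last cell gives reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for one environment transition."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this sketch the reward arrives only at the end of the corridor, so the discount factor `gamma` is what makes earlier "move right" actions valuable: the learned value of moving right shrinks by a factor of `gamma` for each extra step away from the goal. The greedy policy read off the learned table moves right in every cell, which is the reward-maximizing strategy for this toy problem.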

Papers


Showing 3171–3180 of 15113 papers

Title	Date	Tasks	Status	Hype
Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning	Jul 7, 2023	Contrastive Learning, reinforcement-learning	Code Available	1
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment	Jul 7, 2023	Reinforcement Learning (RL)	Code Available	2
Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback	Jul 6, 2023	Decision Making, LEMMA	Unverified	0
Offline Reinforcement Learning with Imbalanced Datasets	Jul 6, 2023	D4RL, Offline RL	Unverified	0
Learning Multi-Agent Intention-Aware Communication for Optimal Multi-Order Execution in Finance	Jul 6, 2023	Reinforcement Learning (RL)	Unverified	0
A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Observations	Jul 6, 2023	Acrobot, Decision Making	Unverified	0
Dynamic Observation Policies in Observation Cost-Sensitive Reinforcement Learning	Jul 5, 2023	OpenAI Gym, reinforcement-learning	Code Available	0
Generative Job Recommendations with Large Language Model	Jul 5, 2023	Collaborative Filtering, Language Modeling	Unverified	0
LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning	Jul 5, 2023	Offline RL, Q-Learning	Unverified	0
First-Explore, then Exploit: Meta-Learning to Solve Hard Exploration-Exploitation Trade-Offs	Jul 5, 2023	Meta-Learning, Reinforcement Learning (RL)	Code Available	1


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified