Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
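
The interaction loop described above can be made concrete with a small sketch. The example below is illustrative only: it assumes a hypothetical five-cell corridor environment (`step`), a reward of 1 for reaching the last cell, and tabular Q-learning with epsilon-greedy exploration as one of many possible learning rules. All names and hyperparameters (`train`, `alpha`, `gamma`, `epsilon`, `max_steps`) are assumptions chosen for illustration and are not taken from any paper listed here.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (assumed toy setup)
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so the untrained agent does not always pick action 0.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Run the agent-environment loop and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Here the policy is simply the greedy action per state under the learned Q-table; the discount factor `gamma` is what makes the agent value long-term reward rather than only the immediate one.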

Papers

Title	Date	Tasks	Status	Hype
SAMBA: Safe Model-Based & Active Reinforcement Learning	Jun 12, 2020	model, Reinforcement Learning	Code Available	1
TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search	Jun 12, 2020	Computational chemistry, reinforcement-learning	Code Available	1
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning	Jun 11, 2020	Question Answering, Reinforcement Learning (RL)	Code Available	1
Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System	Jun 11, 2020	Hierarchical Reinforcement Learning, reinforcement-learning	Code Available	1
What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study	Jun 10, 2020	Attribute, continuous-control	Code Available	1
Robust Spammer Detection by Nash Reinforcement Learning	Jun 10, 2020	Fraud Detection, reinforcement-learning	Code Available	1
Learning to Incentivize Other Learning Agents	Jun 10, 2020	General Reinforcement Learning, Reinforcement Learning (RL)	Code Available	1
Constrained episodic reinforcement learning in concave-convex and knapsack settings	Jun 9, 2020	reinforcement-learning, Reinforcement Learning	Code Available	1
Learning to Play No-Press Diplomacy with Best Response Policy Iteration	Jun 8, 2020	Deep Reinforcement Learning, Reinforcement Learning (RL)	Code Available	1
Reinforcement Learning Under Moral Uncertainty	Jun 8, 2020	Autonomous Vehicles, BIG-bench Machine Learning	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified