Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1571–1580 of 15113 papers

Title	Date	Tasks	Status	Hype
Collision Probability Distribution Estimation via Temporal Difference Learning	Jul 29, 2024	AI AgentAutonomous Driving	CodeCode Available	1
Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization	Oct 14, 2022	Few-Shot Imitation LearningReinforcement Learning (RL)	CodeCode Available	1
Local policy search with Bayesian optimization	Jun 22, 2021	Bayesian OptimizationReinforcement Learning (RL)	CodeCode Available	1
Logically-Constrained Reinforcement Learning	Jan 24, 2018	Decision MakingDecision Making Under Uncertainty	CodeCode Available	1
Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning	Nov 12, 2019	Combinatorial OptimizationGraph Embedding	CodeCode Available	1
Learning to combine primitive skills: A step towards versatile robotic manipulation	Aug 2, 2019	Data AugmentationImitation Learning	CodeCode Available	1
Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline RL	Jun 7, 2023	Data AugmentationOffline RL	CodeCode Available	1
Look where you look! Saliency-guided Q-networks for generalization in visual Reinforcement Learning	Sep 16, 2022	Deep Reinforcement Learningreinforcement-learning	CodeCode Available	1
Collaborative Multi-Agent Dialogue Model Training Via Reinforcement Learning	Jul 11, 2019	Natural Language Understandingreinforcement-learning	CodeCode Available	1
An End-to-End Reinforcement Learning Approach for Job-Shop Scheduling Problems Based on Constraint Programming	Jun 9, 2023	Combinatorial OptimizationFeature Engineering	CodeCode Available	1

Show:10 25 50

← PrevPage 158 of 1512Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified