Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
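As a rough illustration of the agent, environment, reward, and policy described above, here is a minimal sketch using tabular Q-learning on a toy five-state corridor. The environment, the function names (`step`, `discounted_return`, `q_learning`, `greedy_policy`), and all hyperparameters are assumptions made up for this example; they do not come from any paper listed on this page, and Q-learning is only one of many RL algorithms.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (illustrative)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward with discount factor gamma: sum of gamma**t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties randomly
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next-state value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy that picks the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(q_learning())` gives one action per non-terminal state. In this toy setup the learned policy should favor moving right, since that is the only way to reach the rewarded goal state.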

Papers

Showing 4001–4025 of 15113 papers

Title	Date	Tasks	Status	Score
Fast deep reinforcement learning using online adjustments from the past	Oct 18, 2018	Atari Games, Deep Reinforcement Learning	Code Available	5
Bayesian Inference with Anchored Ensembles of Neural Networks, and Application to Exploration in Reinforcement Learning	May 29, 2018	Bayesian Inference, reinforcement-learning	Code Available	5
Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search	Jan 22, 2019	Neural Architecture Search, Reinforcement Learning	Code Available	5
Bayesian Inverse Reinforcement Learning for Collective Animal Movement	Sep 8, 2020	reinforcement-learning, Reinforcement Learning	Code Available	5
FairStream: Fair Multimedia Streaming Benchmark for Reinforcement Learning Agents	Oct 28, 2024	Fairness, reinforcement-learning	Code Available	5
Device Placement Optimization with Reinforcement Learning	Jun 13, 2017	Language Modeling, Language Modelling	Code Available	5
Case-Based Inverse Reinforcement Learning Using Temporal Coherence	Jun 12, 2022	Imitation Learning, reinforcement-learning	Code Available	5
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems	Feb 20, 2023	Learning-To-Rank, Reinforcement Learning (RL)	Code Available	5
Faults in Deep Reinforcement Learning Programs: A Taxonomy and A Detection Approach	Jan 1, 2021	Deep Reinforcement Learning, Fault Detection	Code Available	5
Cascaded LSTMs based Deep Reinforcement Learning for Goal-driven Dialogue	Oct 31, 2019	Deep Reinforcement Learning, Dialogue Management	Code Available	5
Action-Conditional Video Prediction using Deep Networks in Atari Games	Jul 31, 2015	Atari Games, Reinforcement Learning	Code Available	5
Skill Decision Transformer	Jan 31, 2023	D4RL, Descriptive	Code Available	5
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations	Apr 12, 2019	Deep Reinforcement Learning, Imitation Learning	Code Available	5
External Model Motivated Agents: Reinforcement Learning for Enhanced Environment Sampling	Jun 28, 2024	reinforcement-learning, Reinforcement Learning	Code Available	5
An Optical Control Environment for Benchmarking Reinforcement Learning Algorithms	Mar 23, 2022	Benchmarking, Deep Reinforcement Learning	Code Available	5
An Open-source Sim2Real Approach for Sensor-independent Robot Navigation in a Grid	Nov 5, 2024	Autonomous Navigation, Reinforcement Learning (RL)	Code Available	5
Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks	May 31, 2019	Model-based Reinforcement Learning, reinforcement-learning	Code Available	5
Diagnosing Bottlenecks in Deep Q-learning Algorithms	Feb 26, 2019	continuous-control, Continuous Control	Code Available	5
Analysis and Control of a Planar Quadrotor	Jun 29, 2021	Position, reinforcement-learning	Code Available	5
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation	May 1, 2025	Hallucination, Navigate	Code Available	5
Carle's Game: An Open-Ended Challenge in Exploratory Machine Creativity	Jul 13, 2021	Artificial Life, GPU	Code Available	5
Smart Magnetic Microrobots Learn to Swim with Deep Reinforcement Learning	Jan 14, 2022	Deep Reinforcement Learning, reinforcement-learning	Code Available	5
A Deep Reinforcement Learning Framework For Column Generation	Jun 3, 2022	Decision Making, Deep Reinforcement Learning	Code Available	5
Extending Environments To Measure Self-Reflection In Reinforcement Learning	Oct 13, 2021	reinforcement-learning, Reinforcement Learning	Code Available	5
MEDIRL: Predicting the Visual Attention of Drivers via Maximum Entropy Deep Inverse Reinforcement Learning	Dec 17, 2019	Autonomous Vehicles, reinforcement-learning	Code Available	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified