Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
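
As a rough illustration of the interaction loop described above, the following sketch runs tabular Q-learning on a toy five-state chain. The environment, the `step` function, and all constants (`GAMMA`, `ALPHA`, the episode count) are assumptions made up for this example and do not come from any of the listed papers. The agent explores with uniformly random actions, which is acceptable here because Q-learning is off-policy.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal (assumed toy setup)
GOAL = N_STATES - 1
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor for long-term reward
ALPHA = 0.5           # learning rate


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == GOAL
    reward = 1.0 if done else 0.0   # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, seed=0):
    """Learn a Q-table from interaction, using random exploratory actions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the greedy policy should move right in every state, since that is the only way to collect the reward; real tasks typically need function approximation and a more careful exploration strategy.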

Papers

Showing 4351–4375 of 15113 papers

Title	Date	Tasks	Status	Score
Understanding the Safety Requirements for Learning-based Power Systems Operations	Oct 11, 2021	BIG-bench Machine Learning, Decision Making	Code Available	5
Model-Based End-to-End Learning for WDM Systems With Transceiver Hardware Impairments	Nov 29, 2021	reinforcement-learning, Reinforcement Learning	Code Available	5
Reinforcement learning based adaptive metaheuristics	Jun 24, 2022	reinforcement-learning, Reinforcement Learning	Code Available	5
Understanding when Dynamics-Invariant Data Augmentations Benefit Model-Free Reinforcement Learning Updates	Oct 26, 2023	Data Augmentation, reinforcement-learning	Code Available	5
Underwater Soft Fin Flapping Motion with Deep Neural Network Based Surrogate Model	Feb 5, 2025	Reinforcement Learning (RL)	Code Available	5
Playing 2048 With Reinforcement Learning	Oct 20, 2021	Playing the Game of 2048, Q-Learning	Code Available	5
Online Prototype Alignment for Few-shot Policy Transfer	Jun 12, 2023	Domain Adaptation, Reinforcement Learning (RL)	Code Available	5
Unified Distributed Environment	May 14, 2022	OpenAI Gym, reinforcement-learning	Code Available	5
QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game	Jun 27, 2019	OpenAI Gym, Q-Learning	Code Available	5
Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective	Jun 13, 2023	Learning-To-Rank, Offline RL	Code Available	5
Meta-Reinforcement Learning for Reliable Communication in THz/VLC Wireless VR Networks	Jan 29, 2021	Meta-Learning, Meta Reinforcement Learning	Code Available	5
Lusifer: LLM-based User SImulated Feedback Environment for online Recommender systems	May 22, 2024	Collaborative Filtering, Recommendation Systems	Code Available	5
Unified State Representation Learning under Data Augmentation	Sep 12, 2022	Data Augmentation, Domain Adaptation	Code Available	5
SAGE: Generating Symbolic Goals for Myopic Models in Deep Reinforcement Learning	Mar 9, 2022	Deep Reinforcement Learning, Minecraft	Code Available	5
Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay	Jul 18, 2016	Atari Games, Deep Reinforcement Learning	Code Available	5
Unifying Count-Based Exploration and Intrinsic Motivation	Jun 6, 2016	Atari Games, Montezuma's Revenge	Code Available	5
Unifying Interpretability and Explainability for Alzheimer's Disease Progression Prediction	Jun 11, 2024	Reinforcement Learning (RL)	Code Available	5
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning	Mar 22, 2017	reinforcement-learning, Reinforcement Learning	Code Available	5
Playing Atari with Six Neurons	Jun 4, 2018	Atari Games, Decision Making	Code Available	5
Playing Doom with SLAM-Augmented Deep Reinforcement Learning	Dec 1, 2016	Deep Reinforcement Learning, object-detection	Code Available	5
Model-based Lifelong Reinforcement Learning with Bayesian Exploration	Oct 20, 2022	model, reinforcement-learning	Code Available	5
Universally Expressive Communication in Multi-Agent Reinforcement Learning	Jun 14, 2022	Graph Learning, Multi-agent Reinforcement Learning	Code Available	5
Universal Policies to Learn Them All	Aug 24, 2019	All, Multi-agent Reinforcement Learning	Code Available	5
Universal Reinforcement Learning Algorithms: Survey and Experiments	May 30, 2017	reinforcement-learning, Reinforcement Learning	Code Available	5
Universal Successor Features Approximators	Dec 18, 2018	Navigate, Reinforcement Learning	Code Available	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified