Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
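
As a rough illustration of the interaction loop described above, the following sketch runs tabular Q-learning (one of the tasks tagged on this page) on a toy problem. The five-state corridor, the `step` and `train` functions, and every hyperparameter value are assumptions made up for this example; they are not taken from any paper listed here.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when both values tie),
            # otherwise exploit the action with the higher estimate.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states 0..3.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Here the reward is the feedback signal, the Q-table holds the agent's estimate of long-term reward, and the greedy policy read from it is the learned decision-making strategy. In this corridor the learned policy should favor stepping right in every state, since that is the shortest path to the only reward.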

Papers

Showing 14851–14900 of 15113 papers

Title	Date	Tasks	Status
Learning to Play General-Sum Games Against Multiple Boundedly Rational Agents	Jun 10, 2021	Decision Making, Multi-agent Reinforcement Learning	Code Available
Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models	Jul 4, 2024	Common Sense Reasoning, Reinforcement Learning (RL)	Code Available
Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning	Nov 29, 2022	Offline RL, reinforcement-learning	Code Available
Learning to reinforcement learn for Neural Architecture Search	Nov 9, 2019	Meta-Learning, Neural Architecture Search	Code Available
Behavior-based Neuroevolutionary Training in Reinforcement Learning	May 17, 2021	Evolutionary Algorithms, reinforcement-learning	Code Available
A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents	May 26, 2023	Instruction Following, Reinforcement Learning (RL)	Code Available
Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning	Oct 11, 2021	Deep Reinforcement Learning, reinforcement-learning	Code Available
Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning	Sep 16, 2019	Motion Planning, reinforcement-learning	Code Available
IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit based on Analyses of Interestingness	Jul 18, 2023	Deep Reinforcement Learning, Reinforcement Learning (RL)	Code Available
Lifelong Reinforcement Learning with Modulating Masks	Dec 21, 2022	Lifelong learning, reinforcement-learning	Code Available
Estimating Risk and Uncertainty in Deep Reinforcement Learning	May 23, 2019	Bayesian Inference, Deep Reinforcement Learning	Code Available
Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video	Oct 21, 2019	continuous-control, Continuous Control	Code Available
Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods	Sep 22, 2021	continuous-control, Continuous Control	Code Available
Estimation of Warfarin Dosage with Reinforcement Learning	Sep 15, 2021	Multi-Armed Bandits, reinforcement-learning	Code Available
Andes_gym: A Versatile Environment for Deep Reinforcement Learning in Power Systems	Mar 2, 2022	Deep Reinforcement Learning, OpenAI Gym	Code Available
Ethical Challenges in Data-Driven Dialogue Systems	Nov 24, 2017	reinforcement-learning, Reinforcement Learning	Code Available
Deep Reinforcement Learning using Genetic Algorithm for Parameter Optimization	Feb 19, 2019	Deep Reinforcement Learning, reinforcement-learning	Code Available
Control Regularization for Reduced Variance Reinforcement Learning	May 14, 2019	continuous-control, Continuous Control	Code Available
Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning	Oct 11, 2021	reinforcement-learning, Reinforcement Learning	Code Available
Learning to reset in target search problems	Mar 14, 2025	reinforcement-learning, Reinforcement Learning	Code Available
Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning	Jul 21, 2022	reinforcement-learning, Reinforcement Learning (RL)	Code Available
Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning	Jun 10, 2022	Fairness, Multi-Objective Reinforcement Learning	Code Available
GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation	Jan 19, 2025	Bug fixing, Code Completion	Code Available
Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime	Apr 16, 2025	Reinforcement Learning (RL)	Code Available
Improving the Efficient Neural Architecture Search via Rewarding Modifications	Dec 17, 2020	Neural Architecture Search, reinforcement-learning	Code Available
Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning	Feb 21, 2017	Atari Games, Board Games	Code Available
Green Simulation Assisted Reinforcement Learning with Model Risk for Biomanufacturing Learning and Control	Jun 17, 2020	Decision Making, Model-based Reinforcement Learning	Code Available
Jet grooming through reinforcement learning	Mar 22, 2019	reinforcement-learning, Reinforcement Learning	Code Available
An Autonomous Performance Testing Framework using Self-Adaptive Fuzzy Reinforcement Learning	Aug 19, 2019	reinforcement-learning, Reinforcement Learning	Code Available
Least-Squares Policy Iteration	Dec 4, 2003	Q-Learning, reinforcement-learning	Code Available
Learning from Demonstration without Demonstrations	Jun 17, 2021	Reinforcement Learning (RL)	Code Available
Beating Atari with Natural Language Guided Reinforcement Learning	Apr 18, 2017	Atari Games, Deep Reinforcement Learning	Code Available
Control of nonlinear, complex and black-boxed greenhouse system with reinforcement learning	Jul 30, 2019	Q-Learning, reinforcement-learning	Code Available
Bayesian Robust Optimization for Imitation Learning	Jul 24, 2020	Imitation Learning, reinforcement-learning	Code Available
Improving the Performance of Backward Chained Behavior Trees that use Reinforcement Learning	Dec 27, 2021	reinforcement-learning, Reinforcement Learning (RL)	Code Available
Evaluating the Paperclip Maximizer: Are RL-Based Language Models More Likely to Pursue Instrumental Goals?	Feb 16, 2025	reinforcement-learning, Reinforcement Learning	Code Available
Improving thermal state preparation of Sachdev-Ye-Kitaev model with reinforcement learning on quantum hardware	Jan 20, 2025	Reinforcement Learning (RL)	Code Available
Control of Continuous Quantum Systems with Many Degrees of Freedom based on Convergent Reinforcement Learning	Dec 21, 2022	Deep Reinforcement Learning, Q-Learning	Code Available
Evaluating the Robustness of Deep Reinforcement Learning for Autonomous Policies in a Multi-agent Urban Driving Environment	Dec 22, 2021	Autonomous Driving, Benchmarking	Code Available
A Reinforcement Learning Framework for Dynamic Mediation Analysis	Jan 31, 2023	reinforcement-learning, Reinforcement Learning	Code Available
Improving the sample-efficiency of neural architecture search with reinforcement learning	Oct 13, 2021	AutoML, Deep Learning	Code Available
A Reinforcement Learning Approach to Sensing Design in Resource-Constrained Wireless Networked Control Systems	Apr 1, 2022	Decision Making, Reinforcement Learning (RL)	Code Available
Improving Unsupervised Hierarchical Representation with Reinforcement Learning	Jan 1, 2024	reinforcement-learning, Reinforcement Learning	Code Available
An Autonomous Non-monolithic Agent with Multi-mode Exploration based on Options Framework	May 2, 2023	Reinforcement Learning (RL)	Code Available
Join Query Optimization with Deep Reinforcement Learning Algorithms	Nov 26, 2019	Attribute, Deep Reinforcement Learning	Code Available
Adaptive Estimator Selection for Off-Policy Evaluation	Feb 18, 2020	Multi-Armed Bandits, Off-policy evaluation	Code Available
Controlling Large Language Model with Latent Actions	Mar 27, 2025	CoLA, Language Modeling	Code Available
Gradient Importance Learning for Incomplete Observations	Jul 5, 2021	Imputation, Missing Values	Code Available
Learning the Optimal Power Flow: Environment Design Matters	Mar 26, 2024	Reinforcement Learning (RL)	Code Available
Bayesian Optimization with Robust Bayesian Neural Networks	Dec 1, 2016	Bayesian Optimization, Deep Reinforcement Learning	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified