Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
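
The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not taken from any of the listed papers: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and the values of `GAMMA` and `ALPHA` are all assumptions made for the example. The agent explores with uniformly random actions and, because Q-learning is off-policy, still learns value estimates for the greedy policy.

```python
import random

# Hypothetical toy setup (assumed for illustration only).
N_STATES = 5      # states 0..4; reaching state 4 ends the episode with reward 1
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GAMMA = 0.9       # discount factor applied to future rewards
ALPHA = 0.5       # learning rate for the Q-value update


def step(state, action):
    """Toy environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, max_steps=100, seed=0):
    """Learn a Q-table from random exploration using the Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # uniformly random behavior policy
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            if done:
                break
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the reward is the only feedback the agent receives, and the learned policy is read off the Q-table by taking the highest-valued action in each state. Practical RL systems usually replace the table with a function approximator and use a more deliberate exploration strategy.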

Papers

Showing 6951–6975 of 15113 papers

Title	Date	Tasks	Status
Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs	Nov 24, 2020	Out-of-Distribution Detection, reinforcement-learning	Unverified
Uncertainty Estimation for Language Reward Models	Mar 14, 2022	Active Learning, Reinforcement Learning (RL)	Unverified
Uncertainty quantification for Markov chains with application to temporal difference learning	Feb 19, 2025	reinforcement-learning, Reinforcement Learning	Unverified
Uncertainty Regularized Policy Learning for Offline Reinforcement Learning	Sep 29, 2021	D4RL, Offline RL	Unverified
Uncertainty Weighted Offline Reinforcement Learning	Jan 1, 2021	Offline RL, Q-Learning	Unverified
Uncovering Surprising Behaviors in Reinforcement Learning via Worst-case Analysis	May 1, 2019	Navigate, reinforcement-learning	Unverified
Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings	Feb 26, 2019	Reinforcement Learning, Reinforcement Learning (RL)	Unverified
Understanding and Leveraging Overparameterization in Recursive Value Estimation	Sep 29, 2021	Reinforcement Learning (RL), Value prediction	Unverified
Understanding and Leveraging Causal Relations in Deep Reinforcement Learning	Jan 1, 2021	Decision Making, Deep Reinforcement Learning	Unverified
Understanding and Preventing Capacity Loss in Reinforcement Learning	Apr 20, 2022	Montezuma's Revenge, reinforcement-learning	Unverified
Understanding and Shifting Preferences for Battery Electric Vehicles	Feb 9, 2022	Reinforcement Learning (RL)	Unverified
Understanding and Simplifying One-Shot Architecture Search	Jul 1, 2018	Neural Architecture Search, reinforcement-learning	Unverified
Optimality theory of stigmergic collective information processing by chemotactic cells	Jul 21, 2024	Reinforcement Learning (RL)	Unverified
A Look at Value-Based Decision-Time vs. Background Planning Methods Across Different Settings	Jun 16, 2022	Model-based Reinforcement Learning, reinforcement-learning	Unverified
Understanding Deep Neural Function Approximation in Reinforcement Learning via ε-Greedy Exploration	Sep 15, 2022	reinforcement-learning, Reinforcement Learning (RL)	Unverified
Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization	Dec 1, 2021	Model-based Reinforcement Learning, reinforcement-learning	Unverified
Understanding & Generalizing AlphaGo Zero	May 1, 2019	Decision Making, reinforcement-learning	Unverified
Understanding Hindsight Goal Relabeling from a Divergence Minimization Perspective	Sep 26, 2022	Imitation Learning, Multi-Goal Reinforcement Learning	Unverified
The Importance of Online Data: Understanding Preference Fine-tuning via Coverage	Jun 3, 2024	Reinforcement Learning (RL)	Unverified
Understanding Reinforcement Learning Algorithms: The Progress from Basic Q-learning to Proximal Policy Optimization	Mar 31, 2023	Offline RL, Q-Learning	Unverified
Understanding Self-Predictive Learning for Reinforcement Learning	Dec 6, 2022	reinforcement-learning, Reinforcement Learning	Unverified
Understanding the Complexity Gains of Single-Task RL with a Curriculum	Dec 24, 2022	Reinforcement Learning (RL)	Unverified
Understanding the Generalization Gap in Visual Reinforcement Learning	Sep 29, 2021	Data Augmentation, Deep Reinforcement Learning	Unverified
Understanding the Limits of Poisoning Attacks in Episodic Reinforcement Learning	Aug 29, 2022	reinforcement-learning, Reinforcement Learning	Unverified
Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning	Oct 28, 2020	Reinforcement Learning (RL)	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified