SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives for its actions, in the form of rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
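As a rough illustration of what a "cumulative reward" can look like, the sketch below sums per-step rewards with a discount factor. The description above does not say how rewards are aggregated, so the discounting scheme, the `gamma` parameter, and the helper name `discounted_return` are all assumptions made for illustration.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Hypothetical helper: r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be in [0, 1]")
    total = 0.0
    # Walk the episode backwards so each step needs one multiply and one add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # Three steps: a reward, a penalty, then a larger reward.
    print(discounted_return([1.0, -1.0, 2.0], gamma=0.5))
```

With `gamma=1.0` this reduces to a plain sum of rewards; smaller values weight near-term rewards more heavily than distant ones.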

Papers

Showing 8001–8025 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Deep Reinforcement Learning Approach for Audio-based Navigation and Audio Source Localization in Multi-speaker Environments | | 0 |
| Operator Shifting for Model-based Policy Evaluation | | 0 |
| Mixture-of-Variational-Experts for Continual Learning | Code | 0 |
| Which Model to Trust: Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms for Continuous Control Tasks | Code | 0 |
| Self-Consistent Models and Values | | 0 |
| Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning | | 0 |
| Deep Reinforcement Learning for Simultaneous Sensing and Channel Access in Cognitive Networks | | 0 |
| Foresight of Graph Reinforcement Learning Latent Permutations Learnt by Gumbel Sinkhorn Network | | 0 |
| Fully Distributed Actor-Critic Architecture for Multitask Deep Reinforcement Learning | | 0 |
| Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits | | 0 |
| Policy Search using Dynamic Mirror Descent MPC for Model Free Off Policy RL | | 0 |
| Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction | | 0 |
| ReLAX: Reinforcement Learning Agent eXplainer for Arbitrary Predictive Models | Code | 0 |
| Reinforcement Learning for Process Control with Application in Semiconductor Manufacturing | | 0 |
| Patient level simulation and reinforcement learning to discover novel strategies for treating ovarian cancer | | 0 |
| Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming | | 0 |
| A Reinforcement Learning Approach to Parameter Selection for Distributed Optimal Power Flow | | 0 |
| C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks | | 0 |
| Is High Variance Unavoidable in RL? A Case Study in Continuous Control | | 0 |
| Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information | | 0 |
| Can Q-learning solve Multi Armed Bantids? | | 0 |
| Anti-Concentrated Confidence Bonuses for Scalable Exploration | | 0 |
| Deep Reinforcement Learning for Online Control of Stochastic Partial Differential Equations | | 0 |
| Off-Dynamics Inverse Reinforcement Learning from Hetero-Domain | | 0 |
| Neuro-Symbolic Reinforcement Learning with First-Order Logic | | 0 |
Page 321 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |