SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
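The agent-environment loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed here: the `ChainEnv` corridor, the function names, and the hyperparameters (`alpha`, `gamma`, `epsilon`) are all assumptions chosen for brevity.

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor: start in state 0, reward 1.0 on reaching state 4."""

    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp so the agent cannot leave the corridor.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train_q_learning(env, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, max_steps=100, seed=0):
    """Learn action values from reward feedback with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(env.n_actions)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice(
                    [a for a in range(env.n_actions) if q[state][a] == best])
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = train_q_learning(ChainEnv())
    print(greedy_policy(q_table))
```

In this sketch the learned policy should prefer moving right in every non-terminal state, since that is the only route to the reward. The entry for the terminal state is never updated and carries no meaning.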

Papers

Showing 8026-8050 of 15113 papers

Title | Status | Hype
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning | | 0
MURO: Deployment Constrained Reinforcement Learning with Model-based Uncertainty Regularized Batch Optimization | | 0
MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning | | 0
MUST: A Framework for Training Task-oriented Dialogue Systems with Multiple User SimulaTors | | 0
Muti-Agent Proximal Policy Optimization For Data Freshness in UAV-assisted Networks | | 0
Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study | | 0
Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling | | 0
Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning | | 0
Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning | | 0
Mutual Reinforcement Learning | | 0
M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search | | 0
N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning | | 0
NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning | | 0
NANCY: Neural Adaptive Network Coding methodologY for video distribution over wireless networks | | 0
Resource Allocation in Disaggregated Data Centre Systems with Reinforcement Learning | | 0
NaRLE: Natural Language Models using Reinforcement Learning with Emotion Feedback | | 0
Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator | | 0
Natural Gradient Deep Q-learning | | 0
Natural Language-conditioned Reinforcement Learning with Inside-out Task Language Development and Translation | | 0
Natural Language Generation as Planning under Uncertainty Using Reinforcement Learning | | 0
Natural Language Person Search Using Deep Reinforcement Learning | | 0
Natural Language Reinforcement Learning | | 0
Language is Power: Representing States Using Natural Language in Reinforcement Learning | | 0
Natural Policy Gradient and Actor Critic Methods for Constrained Multi-Task Reinforcement Learning | | 0
Natural Policy Gradient for Average Reward Non-Stationary RL | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified