SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback: rewards or penalties that follow its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
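
As a rough illustration of the loop described above (act, receive a reward, update, repeat), here is a minimal sketch using tabular Q-learning. Everything in it is an assumption made for illustration and does not come from any listed paper: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values.

```python
import random

N_STATES = 5        # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right


def step(state, action):
    """Toy environment: reward 1.0 on reaching the rightmost state, else 0.0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both estimates tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action (as a move of -1 or +1) for each non-terminal state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]
```

Calling `greedy_policy(train())` returns the learned move for each non-terminal state. In this toy setting the discount factor `gamma` is what makes shorter paths to the goal more valuable, so the long-term objective, not the immediate reward, shapes the policy.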

Papers

Showing 7776-7800 of 15113 papers

Title | Status | Hype
Model Mediated Teleoperation with a Hand-Arm Exoskeleton in Long Time Delays Using Reinforcement Learning | | 0
Model-predictive control and reinforcement learning in multi-energy system case studies | | 0
Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming | | 0
Model Predictive Control via On-Policy Imitation Learning | | 0
Model-Reference Reinforcement Learning Control of Autonomous Surface Vehicles with Uncertainties | | 0
Model-Reference Reinforcement Learning for Collision-Free Tracking Control of Autonomous Surface Vehicles | | 0
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol | | 0
Model Selection in Reinforcement Learning with General Function Approximations | | 0
Model Selection for Generic Reinforcement Learning | | 0
Modified Actor-Critics | | 0
Modifying RL Policies with Imagined Actions: How Predictable Policies Can Enable Users to Perform Novel Tasks | | 0
MODRL/D-AM: Multiobjective Deep Reinforcement Learning Algorithm Using Decomposition and Attention Model for Multiobjective Optimization | | 0
MODRL/D-EL: Multiobjective Deep Reinforcement Learning with Evolutionary Learning for Multiobjective Optimization | | 0
Modular Architecture for StarCraft II with Deep Reinforcement Learning | | 0
Modularity benefits reinforcement learning agents with competing homeostatic drives | | 0
Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment | | 0
Modulated Policy Hierarchies | | 0
Modulating Reservoir Dynamics via Reinforcement Learning for Efficient Robot Skill Synthesis | | 0
MoET: Interpretable and Verifiable Reinforcement Learning via Mixture of Expert Trees | | 0
Molecular Design in Synthetically Accessible Chemical Space via Deep Reinforcement Learning | | 0
Molecular Generative Adversarial Network with Multi-Property Optimization | | 0
Mollification Effects of Policy Gradient Methods | | 0
Momentum in Reinforcement Learning | | 0
MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking | | 0
MONAS: Multi-Objective Neural Architecture Search using Reinforcement Learning | | 0
Page 312 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified