SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, receiving rewards or penalties for its actions. The goal is to find an optimal policy: a decision-making strategy that maximizes long-term reward.
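
The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the two-armed bandit environment, its payout probabilities, the epsilon-greedy rule, and all names (`PAYOUT_PROB`, `pull`, `run_agent`, `epsilon`) are hypothetical and chosen only to show an agent acting, receiving rewards, and adjusting its policy to raise cumulative reward.

```python
import random

# Hypothetical environment: a two-armed bandit where arm 1 pays out more often.
PAYOUT_PROB = [0.2, 0.8]  # assumed values, for illustration only


def pull(arm, rng):
    """Environment step: reward 1.0 with the arm's payout probability, else 0.0."""
    return 1.0 if rng.random() < PAYOUT_PROB[arm] else 0.0


def run_agent(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent that estimates each action's value from observed rewards."""
    rng = random.Random(seed)
    value = [0.0, 0.0]  # running mean reward per action
    count = [0, 0]      # number of times each action was taken
    total_reward = 0.0
    for _ in range(steps):
        # Policy: explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            arm = rng.randrange(2)
        else:
            arm = 0 if value[0] >= value[1] else 1
        reward = pull(arm, rng)
        total_reward += reward
        # Learn from feedback: incremental update of the running mean.
        count[arm] += 1
        value[arm] += (reward - value[arm]) / count[arm]
    return value, count, total_reward


if __name__ == "__main__":
    value, count, total_reward = run_agent()
    print("estimated values:", value)
    print("action counts:", count)
    print("cumulative reward:", total_reward)
```

A bandit has no state transitions, so it is the simplest possible setting; a full RL problem would also track the environment state and account for delayed rewards.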

Papers

Showing 9876-9900 of 15113 papers

Title | Status | Hype
More Efficient Off-Policy Evaluation through Regularized Targeted Learning |  | 0
(More) Efficient Reinforcement Learning via Posterior Sampling |  | 0
MOReL: Model-Based Offline Reinforcement Learning |  | 0
More Robust Doubly Robust Off-policy Evaluation |  | 0
MoRE: Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models |  | 0
MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading |  | 0
MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding |  | 0
Motion Perception in Reinforcement Learning with Dynamic Objects |  | 0
Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments |  | 0
Motion Planning by Reinforcement Learning for an Unmanned Aerial Vehicle in Virtual Open Space with Static Obstacles |  | 0
Motion Planning for Autonomous Vehicles in the Presence of Uncertainty Using Reinforcement Learning |  | 0
Motion Prediction on Self-driving Cars: A Review |  | 0
MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning |  | 0
Motivating Physical Activity via Competitive Human-Robot Interaction |  | 0
MP3: Movement Primitive-Based (Re-)Planning Policy |  | 0
MPC4RL -- A Software Package for Reinforcement Learning based on Model Predictive Control |  | 0
MPC-based Reinforcement Learning for a Simplified Freight Mission of Autonomous Surface Vehicles |  | 0
MPC-based Reinforcement Learning for Economic Problems with Application to Battery Storage |  | 0
MQES: Max-Q Entropy Search for Efficient Exploration in Continuous Reinforcement Learning |  | 0
MQGrad: Reinforcement Learning of Gradient Quantization in Parameter Server |  | 0
MRAC-RL: A Framework for On-Line Policy Adaptation Under Parametric Model Uncertainty |  | 0
MSDF: A Deep Reinforcement Learning Framework for Service Function Chain Migration |  | 0
MS-Ranker: Accumulating Evidence from Potentially Correct Candidates for Answer Selection |  | 0
MSRL: Distributed Reinforcement Learning with Dataflow Fragments |  | 0
MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation |  | 0
Page 396 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified