SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
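The interaction loop described above can be sketched in a few lines. This is a minimal illustration, not a method taken from any paper listed here: `ChainEnv` is a hypothetical five-state corridor invented for the example, and tabular Q-learning is just one simple way to learn a policy from rewards. The hyperparameters (`alpha`, `gamma`, `epsilon`) are arbitrary assumptions.

```python
import random


def discounted_return(rewards, gamma=0.99):
    """Cumulative reward: sum of gamma**t * r_t over one episode."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


class ChainEnv:
    """Hypothetical corridor: start in state 0, reward 1.0 on reaching state 4."""

    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp so the agent cannot leave the corridor.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def q_learning(env, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(env.n_actions)  # explore
            else:
                best = max(q[s])  # exploit, breaking ties at random
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            s_next, r, done = env.step(a)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = q_learning(ChainEnv())
    print(greedy_policy(q_table))
```

Here the reward is the only feedback the agent receives, and the learned table approximates the long-term (discounted) value of each action, which is what the greedy policy then maximizes.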

Papers

Showing 11451–11500 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Visual Tracking by means of Deep Reinforcement Learning and an Expert Demonstrator | | 0 |
| Visuomotor Mechanical Search: Learning to Retrieve Target Objects in Clutter | | 0 |
| ViVa: Video-Trained Value Functions for Guiding Online RL from Diverse Data | | 0 |
| Vizarel: A System to Help Better Understand RL Agents | | 0 |
| VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning | | 0 |
| VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making | | 0 |
| VLM-RL: A Unified Vision Language Models and Reinforcement Learning Framework for Safe Autonomous Driving | | 0 |
| VLP: Vision-Language Preference Learning for Embodied Manipulation | | 0 |
| VL-SAFE: Vision-Language Guided Safety-Aware Reinforcement Learning with World Models for Autonomous Driving | | 0 |
| VMAV-C: A Deep Attention-based Reinforcement Learning Algorithm for Model-based Control | | 0 |
| vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of Gradient Directions for Policy Improvement | | 0 |
| VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play | | 0 |
| VOQL: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation | | 0 |
| Voting-Based Multi-Agent Reinforcement Learning for Intelligent IoT | | 0 |
| VPE: Variational Policy Embedding for Transfer Reinforcement Learning | | 0 |
| VRAIL: Vectorized Reward-based Attribution for Interpretable Learning | | 0 |
| VRLS: A Unified Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications | | 0 |
| Advancing Autonomous VLM Agents via Variational Subgoal-Conditioned Reinforcement Learning | | 0 |
| Vulcan: Solving the Steiner Tree Problem with Graph Neural Networks and Deep Reinforcement Learning | | 0 |
| Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics | | 0 |
| WAD: A Deep Reinforcement Learning Agent for Urban Autonomous Driving | | 0 |
| Wall Street Tree Search: Risk-Aware Planning for Offline Reinforcement Learning | | 0 |
| Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap | | 0 |
| Warmth and competence in human-agent cooperation | | 0 |
| Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes | | 0 |
| Warren at SemEval-2020 Task 4: ALBERT and Multi-Task Learning for Commonsense Validation | | 0 |
| Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control | | 0 |
| Wasserstein Adversarial Imitation Learning | | 0 |
| Wasserstein Dependency Measure for Representation Learning | | 0 |
| Wasserstein Robust Reinforcement Learning | | 0 |
| Wasserstein Unsupervised Reinforcement Learning | | 0 |
| Watch from sky: machine-learning-based multi-UAV network for predictive police surveillance | | 0 |
| Stop-and-Go: Exploring Backdoor Attacks on Deep Reinforcement Learning-based Traffic Congestion Control Systems | | 0 |
| WaveCorr: Deep Reinforcement Learning with Permutation Invariant Policy Networks for Portfolio Management | | 0 |
| Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog | | 0 |
| Way Off-Policy Batch Deep Reinforcement Learning of Human Preferences in Dialog | | 0 |
| On L_2-consistency of nearest neighbor matching | | 0 |
| Weakly Supervised Disentangled Representation for Goal-conditioned Reinforcement Learning | | 0 |
| Weakly-Supervised Learning of Disentangled and Interpretable Skills for Hierarchical Reinforcement Learning | | 0 |
| Weakly-Supervised Reinforcement Learning for Controllable Behavior | | 0 |
| Weakly Supervised Video Summarization by Hierarchical Reinforcement Learning | | 0 |
| Weakness Analysis of Cyberspace Configuration Based on Reinforcement Learning | | 0 |
| Weber-Fechner Law in Temporal Difference learning derived from Control as Inference | | 0 |
| WebWISE: Web Interface Control and Sequential Exploration with Large Language Models | | 0 |
| Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates | | 0 |
| Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments | | 0 |
| Weighted Entropy Modification for Soft Actor-Critic | | 0 |
| Weighted Likelihood Policy Search with Model Selection | | 0 |
| Weighted Maximum Entropy Inverse Reinforcement Learning | | 0 |
| Weighted model estimation for offline model-based reinforcement learning | | 0 |
Page 230 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |