SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
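
The interaction loop described above can be sketched in a few lines. This is a minimal illustration, not a method taken from any paper listed here: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and the choice of tabular Q-learning with epsilon-greedy exploration are all assumptions made for the example.

```python
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (-1, +1)  # action index 0 moves left, action index 1 moves right


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the last state
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both values are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action (as a move of -1 or +1) for each non-terminal state."""
    return [ACTIONS[max(range(2), key=lambda a: q[s][a])]
            for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` should yield a policy that moves right in every non-terminal state, since that is the only way to collect the reward. The discount factor `gamma` is what turns "long-term reward" into a concrete quantity: states closer to the goal end up with higher learned values.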

Papers

Showing 10101–10125 of 15113 papers

Title | Status | Hype
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial | | 0
Motion Prediction on Self-driving Cars: A Review | | 0
Sample-efficient Reinforcement Learning in Robotic Table Tennis | | 0
The Value Equivalence Principle for Model-Based Reinforcement Learning | | 0
Playing optical tweezers with deep reinforcement learning: in virtual, physical and augmented environments | | 0
Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping | | 0
LBGP: Learning Based Goal Planning for Autonomous Following in Front | | 0
A Hysteretic Q-learning Coordination Framework for Emerging Mobility Systems in Smart Cities | | 0
Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks | | 0
Generative Inverse Deep Reinforcement Learning for Online Recommendation | | 0
XCSF for Automatic Test Case Prioritization | Code | 0
Optimal Control-Based Baseline for Guided Exploration in Policy Gradient Methods | | 0
Online Observer-Based Inverse Reinforcement Learning | | 0
Control with adaptive Q-learning | Code | 0
Distributional Reinforcement Learning for mmWave Communications with Intelligent Reflectors on a UAV | | 0
Deep Reinforcement Learning Based Dynamic Route Planning for Minimizing Travel Time | | 0
Differentiable Physics Models for Real-world Offline Model-based Reinforcement Learning | | 0
Incorporating Rivalry in Reinforcement Learning for a Competitive Game | | 0
Causal Campbell-Goodhart's law and Reinforcement Learning | Code | 0
Cooperative Heterogeneous Deep Reinforcement Learning | | 0
A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting | | 0
Exact Asymptotics for Linear Quadratic Adaptive Control | Code | 0
Fast Reinforcement Learning with Incremental Gaussian Mixture Models | | 0
Depth Self-Optimized Learning Toward Data Science | Code | 0
Information-theoretic Task Selection for Meta-Reinforcement Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified