SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
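As a rough illustration of the agent-environment loop described above, here is a minimal sketch of tabular Q-learning, one common RL algorithm, on a made-up five-state chain. The environment (`step`), the hyperparameters (`alpha`, `gamma`, `epsilon`), and the use of a discount factor are all assumptions chosen for this example; none of them come from the listed papers.

```python
import random

def step(state, action, n_states=5):
    """Hypothetical 1-D chain: action 1 moves right, action 0 moves left.
    Reaching the rightmost state yields reward 1 and ends the episode."""
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train(episodes=500, n_states=5, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy.
    All hyperparameter values are illustrative assumptions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action, n_states)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to action 1)."""
    return [0 if left > right else 1 for left, right in q]

q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned greedy policy should move right in every non-terminal state, since that is the only way to reach the reward. Real problems usually need function approximation rather than a table, which is what most of the papers listed on this page address.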

Papers

Showing 8151-8175 of 15113 papers

Title | Status | Hype
Goal Misgeneralization in Deep Reinforcement Learning | Code | 1
Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model | - | 0
Risk-Aware Transfer in Reinforcement Learning using Successor Features | - | 0
Optimistic Reinforcement Learning by Forward Kullback-Leibler Divergence Optimization | - | 0
Pattern Transfer Learning for Reinforcement Learning in Order Dispatching | - | 0
Branching Dueling Q-Network Based Online Scheduling of a Microgrid With Distributed Energy Storage Systems | - | 0
AndroidEnv: A Reinforcement Learning Platform for Android | Code | 2
A Modular and Transferable Reinforcement Learning Framework for the Fleet Rebalancing Problem | - | 0
Adversarial Intrinsic Motivation for Reinforcement Learning | Code | 0
Context-aware taxi dispatching at city-scale using deep reinforcement learning | - | 0
Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning | Code | 0
Trajectory Modeling via Random Utility Inverse Reinforcement Learning | - | 0
Unbiased Asymmetric Reinforcement Learning under Partial Observability | - | 0
Safe Model-based Off-policy Reinforcement Learning for Eco-Driving in Connected and Automated Hybrid Electric Vehicles | - | 0
Robust Value Iteration for Continuous Control Tasks | Code | 1
Towards Scalable Verification of Deep Reinforcement Learning | Code | 0
Transfer Learning and Curriculum Learning in Sokoban | - | 0
A Generalised Inverse Reinforcement Learning Framework | - | 0
A Comparison of Reward Functions in Q-Learning Applied to a Cart Position Problem | Code | 0
Bayesian Nonparametric Reinforcement Learning in LTE and Wi-Fi Coexistence | - | 0
KnowSR: Knowledge Sharing among Homogeneous Agents in Multi-agent Reinforcement Learning | - | 0
Interpretable UAV Collision Avoidance using Deep Reinforcement Learning | - | 0
IGO-QNN: Quantum Neural Network Architecture for Inductive Grover Oracularization | - | 0
FNAS: Uncertainty-Aware Fast Neural Architecture Search | - | 0
Verification of Dissipativity and Evaluation of Storage Function in Economic Nonlinear MPC using Q-Learning | - | 0
Page 327 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified