SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
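As a rough illustration of the agent-environment loop described above, the following sketch runs tabular Q-learning on a toy problem. The five-state corridor environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all assumptions made for this example and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment (an assumption for illustration): a 5-state
# corridor. The agent starts in state 0 and receives a reward of +1 only
# when it reaches the rightmost state, which ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for one environment transition."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Move Q(s, a) toward the reward plus the discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state under the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns one action per non-terminal state. In this corridor the reward-maximizing strategy is to move right everywhere, so the learned policy can be checked against that expectation.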

Papers

Showing 9901-9925 of 15113 papers

Title | Status | Hype
Predictive Synthesis of Quantum Materials by Probabilistic Reinforcement Learning |  | 0
Multi-Agent Reinforcement Learning in Cournot Games |  | 0
Variance-Reduced Off-Policy Memory-Efficient Policy Search |  | 0
VacSIM: Learning Effective Strategies for COVID-19 Vaccine Distribution using Reinforcement Learning | Code | 0
Efficient Competitive Self-Play Policy Optimization |  | 0
Guided Policy Search Based Control of a High Dimensional Advanced Manufacturing Process |  | 0
Extended Radial Basis Function Controller for Reinforcement Learning |  | 0
Deep Learning Interference Cancellation in Wireless Networks |  | 0
Reinforcement Learning for Optimal Primary Frequency Control: A Lyapunov Approach | Code | 1
Semantic-preserving Reinforcement Learning Attack Against Graph Neural Networks for Malware Detection | Code | 1
Physically Embedded Planning Problems: New Challenges for Reinforcement Learning | Code | 0
Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments |  | 0
RLCFR: Minimize Counterfactual Regret by Deep Reinforcement Learning |  | 0
TripleTree: A Versatile Interpretable Representation of Black Box Agents and their Environments | Code | 0
A framework for reinforcement learning with autocorrelated actions | Code | 0
COVID-19 Pandemic Cyclic Lockdown Optimization Using Reinforcement Learning |  | 0
Importance Weighted Policy Learning and Adaptation |  | 0
Deep Reinforcement Learning for Option Replication and Hedging |  | 0
AoI Minimization in Status Update Control with Energy Harvesting Sensors |  | 0
DyNODE: Neural Ordinary Differential Equations for Dynamics Modeling in Continuous Control | Code | 1
Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning | Code | 1
Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control |  | 0
QR-MIX: Distributional Value Function Factorisation for Cooperative Multi-Agent Reinforcement Learning |  | 0
Phasic Policy Gradient | Code | 1
Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games |  | 0
Page 397 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified