SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
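The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method taken from any paper listed here: `ChainEnv` is a hypothetical five-state corridor invented for the example, tabular Q-learning is chosen only because it is one of the simplest RL algorithms, and the hyperparameters (`alpha`, `gamma`, `epsilon`) are arbitrary.

```python
import random


def discounted_return(rewards, gamma=0.9):
    """Cumulative reward, with each later step discounted by gamma."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


class ChainEnv:
    """Hypothetical corridor: start in state 0, reward 1.0 on reaching state 4."""

    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp so the agent cannot leave the corridor.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def q_learning(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        steps = 0
        while not done and steps < 100:  # cap episode length
            if rng.random() < epsilon:
                action = rng.randrange(env.n_actions)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice(
                    [a for a in range(env.n_actions) if q[state][a] == best]
                )
            next_state, reward, done = env.step(action)
            # Bootstrap from the best next action unless the episode ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            steps += 1
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = q_learning(ChainEnv())
    print(greedy_policy(q_table))
    print(discounted_return([0.0, 0.0, 1.0], gamma=0.5))
```

In this toy setting the learned policy moves right in every non-terminal state, which is the strategy that maximizes the discounted long-term reward. The action listed for the terminal state is arbitrary, since that state is never acted from.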

Papers

Showing 9151-9175 of 15113 papers

Title | Status | Hype
Learning Vehicle Routing Problems using Policy Optimisation | | 0
SPOTTER: Extending Symbolic Planning Operators through Targeted Reinforcement Learning | | 0
SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II | | 0
Unsupervised deep clustering and reinforcement learning can accurately segment MRI brain tumors with very small training sets | | 0
Rethink AI-based Power Grid Control: Diving Into Algorithm Design | | 0
Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for Portfolio Optimization and Order Execution | | 0
Augmenting Policy Learning with Routines Discovered from a Single Demonstration | Code | 1
Intelligent Reflecting Surface Assisted Anti-Jamming Communications Based on Reinforcement Learning | | 0
Intelligent Resource Allocation in Dense LoRa Networks using Deep Reinforcement Learning | | 0
A Dynamic Penalty Function Approach for Constraints-Handling in Reinforcement Learning | | 0
Scalable Deep Reinforcement Learning for Routing and Spectrum Access in Physical Layer | | 0
Self-Imitation Advantage Learning | | 0
QVMix and QVMix-Max: Extending the Deep Quality-Value Family of Algorithms to Cooperative Multi-Agent Reinforcement Learning | Code | 0
myGym: Modular Toolkit for Visuomotor Robotic Tasks | | 0
Explicitly Encouraging Low Fractional Dimensional Trajectories Via Reinforcement Learning | Code | 0
Difference Rewards Policy Gradients | | 0
Offline Reinforcement Learning from Images with Latent Space Models | Code | 1
Mobile Robot Planner with Low-cost Cameras Using Deep Reinforcement Learning | | 0
Reinforcement Learning-based Product Delivery Frequency Control | | 0
Quantum reinforcement learning in continuous action space | | 0
Minimax Strikes Back | | 0
Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems | Code | 1
Model-Based Actor-Critic with Chance Constraint for Stochastic System | | 0
Generalize a Small Pre-trained Model to Arbitrarily Large TSP Instances | Code | 1
Deep Reinforcement Learning for Joint Spectrum and Power Allocation in Cellular Networks | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified