SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
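As a rough illustration of the agent-environment loop described above, the following is a minimal sketch of tabular Q-learning, one common RL method among many. The corridor environment, the `step` function, and all hyperparameter values are hypothetical choices made for this example and do not come from any of the papers listed on this page.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the rewarding terminal state (assumed toy setup)
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Hypothetical environment: move along the corridor, reward 1 on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q[state][action] from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # step cap so an episode cannot run forever
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy that takes the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train_q_learning()))
```

In this toy setting the learned greedy policy should settle on stepping right in every non-terminal state, since that is the only way to collect the reward. The discount factor `gamma` is what makes the agent prefer reaching the goal sooner, which is one way of formalizing "long-term reward".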

Papers

Showing 3676-3700 of 15113 papers

Title | Status | Hype
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning | Code | 0
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning | Code | 0
Recurrent Sum-Product-Max Networks for Decision Making in Perfectly-Observed Environments | Code | 0
Generalization and Regularization in DQN | Code | 0
Generalization in Visual Reinforcement Learning with the Reward Sequence Distribution | Code | 0
Autonomous Option Invention for Continual Hierarchical Reinforcement Learning and Planning | Code | 0
Generalized Speedy Q-learning | Code | 0
Kernel-Based Reinforcement Learning: A Finite-Time Analysis | Code | 0
Deep Reinforcement Learning for Optimal Stopping with Application in Financial Engineering | Code | 0
Regret Minimization for Partially Observable Deep Reinforcement Learning | Code | 0
Regret Minimization for Reinforcement Learning with Vectorial Feedback and Complex Objectives | Code | 0
Generalised Discount Functions applied to a Monte-Carlo AImu Implementation | Code | 0
Gaussian Processes for Data-Efficient Learning in Robotics and Control | Code | 0
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning | Code | 0
Regularizing Neural Networks by Penalizing Confident Output Distributions | Code | 0
GAN Q-learning | Code | 0
Applying Deep Reinforcement Learning to the HP Model for Protein Structure Prediction | Code | 0
Deep Reinforcement Learning for Playing 2.5D Fighting Games | Code | 0
Deep Reinforcement Learning for Long-Short Portfolio Optimization | Code | 0
Gap-Dependent Unsupervised Exploration for Reinforcement Learning | Code | 0
Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning | Code | 0
Cloud Database Tuning with Reinforcement Learning | Code | 0
Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs | Code | 0
Reinforcement and Imitation Learning for Diverse Visuomotor Skills | Code | 0
GAC: A Deep Reinforcement Learning Model Toward User Incentivization in Unknown Social Networks | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified