SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
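The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is an illustrative sketch only, not taken from any paper in this listing: the `ChainEnv` environment, the `train` and `greedy_policy` helpers, the hyperparameter values, and the choice of Q-learning itself are all assumptions made for the example.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1 in a row.

    The agent starts at state 0 and receives a reward of 1.0 only when it
    reaches the last state, which ends the episode.
    """

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; positions are clamped.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1,
          max_steps=100, seed=0):
    """Learn a Q-table with epsilon-greedy Q-learning (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, or when both actions are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted best
            # estimated value of the next state (zero if the episode ended).
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to 'right')."""
    return [0 if left > right else 1 for left, right in q]


q_table = train(ChainEnv())
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the reward signal is sparse (only the final state pays out), so the learned policy reflects long-term rather than immediate reward: the agent learns to move right from every non-terminal state because the discounted future reward is higher in that direction.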

Papers

Showing 3701–3725 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck | Code | 0 |
| Deep Reinforcement Learning for Programming Language Correction | Code | 0 |
| Generalised Discount Functions applied to a Monte-Carlo AImu Implementation | Code | 0 |
| Applying Deep Reinforcement Learning to the HP Model for Protein Structure Prediction | Code | 0 |
| Generalization in Text-based Games via Hierarchical Reinforcement Learning | Code | 0 |
| Cloud Database Tuning with Reinforcement Learning | Code | 0 |
| Gap-Dependent Unsupervised Exploration for Reinforcement Learning | Code | 0 |
| GAN Q-learning | Code | 0 |
| GAC: A Deep Reinforcement Learning Model Toward User Incentivization in Unknown Social Networks | Code | 0 |
| Entropy Regularized Reinforcement Learning Using Large Deviation Theory | Code | 0 |
| Accelerate Reinforcement Learning with PID Controllers in the Pendulum Simulations | Code | 0 |
| Gaussian Processes for Data-Efficient Learning in Robotics and Control | Code | 0 |
| Generalization in Visual Reinforcement Learning with the Reward Sequence Distribution | Code | 0 |
| A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer | Code | 0 |
| Autonomous Soft Tissue Retraction Using Demonstration-Guided Reinforcement Learning | Code | 0 |
| Reinforcement Learning Decoders for Fault-Tolerant Quantum Computation | Code | 0 |
| Clipped-Objective Policy Gradients for Pessimistic Policy Optimization | Code | 0 |
| Application of Self-Play Reinforcement Learning to a Four-Player Game of Imperfect Information | Code | 0 |
| Climate Adaptation with Reinforcement Learning: Experiments with Flooding and Transportation in Copenhagen | Code | 0 |
| Fully Parameterized Quantile Function for Distributional Reinforcement Learning | Code | 0 |
| Client Selection for Federated Policy Optimization with Environment Heterogeneity | Code | 0 |
| Deep Reinforcement Learning for Sepsis Treatment | Code | 0 |
| Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing | Code | 0 |
| Functional Acceleration for Policy Mirror Descent | Code | 0 |
| Clickbait? Sensational Headline Generation with Auto-tuned Reinforcement Learning | Code | 0 |
Page 149 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |