SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
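The interaction loop described above can be sketched with tabular Q-learning, chosen here only as one simple illustrative algorithm; the page itself does not name a method. Everything in the sketch is an assumption made for illustration: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, the reward values, and the hyperparameters.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True      # reward for reaching the goal
    return next_state, -0.01, False       # small penalty for every other move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values Q[state][action] from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):              # cap the episode length
            # Epsilon-greedy: mostly exploit, occasionally explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should move right in every cell, since that is the shortest route to the only positive reward. The discount factor `gamma` is what makes the agent weigh long-term reward rather than only the immediate step penalty.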

Papers

Showing 10901–10925 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Balancing Reinforcement Learning Training Experiences in Interactive Information Retrieval | | 0 |
| AutoHAS: Efficient Hyperparameter and Architecture Search | | 0 |
| State Action Separable Reinforcement Learning | | 0 |
| Refined Continuous Control of DDPG Actors via Parametrised Activation | | 0 |
| Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion | Code | 0 |
| Meta-Model-Based Meta-Policy Optimization | | 0 |
| Constrained Reinforcement Learning for Dynamic Optimization under Uncertainty | | 0 |
| A Novel Update Mechanism for Q-Networks Based On Extreme Learning Machines | Code | 0 |
| Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains | | 0 |
| Learning to Scan: A Deep Reinforcement Learning Approach for Personalized Scanning in CT Imaging | | 0 |
| The Value-Improvement Path: Towards Better Representations for Reinforcement Learning | | 0 |
| Temporally-Extended ε-Greedy Exploration | Code | 0 |
| Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration | Code | 0 |
| Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent | Code | 0 |
| Active Vision for Early Recognition of Human Actions | | 0 |
| A novel approach for multi-agent cooperative pursuit to capture grouped evaders | | 0 |
| Reinforcement learning and Bayesian data assimilation for model-informed precision dosing in oncology | | 0 |
| Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning | | 0 |
| Temporal-Differential Learning in Continuous Environments | | 0 |
| Robust Reinforcement Learning with Wasserstein Constraint | | 0 |
| Model-Based Reinforcement Learning with Value-Targeted Regression | | 0 |
| Variational Reward Estimator Bottleneck: Learning Robust Reward Estimator for Multi-Domain Task-Oriented Dialog | | 0 |
| MM-KTD: Multiple Model Kalman Temporal Differences for Reinforcement Learning | Code | 0 |
| Reinforcement Learning | Code | 0 |
| AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments | | 0 |
Page 437 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |