SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
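The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from any paper listed on this page: the five-state chain environment, the `step` function, the hyperparameters, and the choice of tabular Q-learning as the learning rule. The agent starts in state 0, receives a reward of 1 only on reaching state 4, and learns action values from that feedback.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the terminal goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward with discount factor gamma: sum of gamma**t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration on the toy chain."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(100):  # cap episode length so every episode ends
            # Explore with probability epsilon, or when both actions are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next-state value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(q_learning())` returns the learned action for each non-terminal state. In this toy setting the policy that maximizes long-term reward is to move right everywhere, since that reaches the only rewarding state in the fewest steps.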

Papers

Showing 7276-7300 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Combinatorial Reinforcement Learning Based Scheduling for DNN Execution on Edge | | 0 |
| Hypothesis Driven Coordinate Ascent for Reinforcement Learning | | 0 |
| Adversarial Style Transfer for Robust Policy Optimization in Reinforcement Learning | | 0 |
| Distributional Perturbation for Efficient Exploration in Distributional Reinforcement Learning | | 0 |
| Fourier Features in Reinforcement Learning with Neural Networks | Code | 0 |
| AARL: Automated Auxiliary Loss for Reinforcement Learning | | 0 |
| Multi-Agent Reinforcement Learning with Shared Resource in Inventory Management | | 0 |
| Rewardless Open-Ended Learning (ROEL) | | 0 |
| The guide and the explorer: smart agents for resource-limited iterated batch reinforcement learning | | 0 |
| Semi-supervised Offline Reinforcement Learning with Pre-trained Decision Transformers | | 0 |
| Offline Reinforcement Learning with Resource Constrained Online Deployment | | 0 |
| Offline Reinforcement Learning with In-sample Q-Learning | Code | 1 |
| Pretraining for Language Conditioned Imitation with Transformers | | 0 |
| Reasoning With Hierarchical Symbols: Reclaiming Symbolic Policies For Visual Reinforcement Learning | | 0 |
| PDQN - A Deep Reinforcement Learning Method for Planning with Long Delays: Optimization of Manufacturing Dispatching | | 0 |
| Theoretical understanding of adversarial reinforcement learning via mean-field optimal control | | 0 |
| Pareto Policy Adaptation | | 0 |
| SPP-RL: State Planning Policy Reinforcement Learning | | 0 |
| Reinforcement Learning State Estimation for High-Dimensional Nonlinear Systems | | 0 |
| Towards Understanding Distributional Reinforcement Learning: Regularization, Optimization, Acceleration and Sinkhorn Algorithm | | 0 |
| ^2-exploration for Reinforcement Learning | | 0 |
| MOBA: Multi-teacher Model Based Reinforcement Learning | | 0 |
| Rethinking Pareto Approaches in Constrained Reinforcement Learning | | 0 |
| Reinforcement Learning with Ex-Post Max-Min Fairness | | 0 |
| Weakly-Supervised Learning of Disentangled and Interpretable Skills for Hierarchical Reinforcement Learning | | 0 |
Page 292 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |