SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, which takes the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
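As a rough illustration of the agent-environment loop described above, here is a minimal sketch of tabular Q-learning on a toy five-state corridor. The corridor environment, the function names (`step`, `choose_action`, `train`, `greedy_policy`) and the hyperparameter values are all assumptions made for this example and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1    # the two available actions


def step(state, action):
    """Toy environment: move left or right, reward 1.0 on reaching the goal."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q[state][action] from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = choose_action(q[state], epsilon, rng)
            next_state, reward = step(state, action)
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max((LEFT, RIGHT), key=lambda a: row[a]) for row in q[:-1]]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this sketch the reward is the only feedback the agent receives, and the discount factor `gamma` is what makes the learned values reflect long-term rather than immediate reward. After training, the greedy policy should choose `RIGHT` in every non-terminal state.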

Papers

Showing 3111–3120 of 15113 papers

Title | Status | Hype
Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks | - | 0
Mode-constrained Model-based Reinforcement Learning via Gaussian Processes | Code | 0
Reinforcement Learning -based Adaptation and Scheduling Methods for Multi-source DASH | Code | 0
Offline Reinforcement Learning with On-Policy Q-Function Regularization | - | 0
Communication-Efficient Orchestrations for URLLC Service via Hierarchical Reinforcement Learning | - | 0
Submodular Reinforcement Learning | Code | 1
Counterfactual Explanation Policies in RL | - | 0
Settling the Sample Complexity of Online Reinforcement Learning | - | 0
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation | - | 0
Unbiased Weight Maximization | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified