SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes the expected long-term reward.
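As a minimal sketch of the agent-environment loop described above, the following uses tabular Q-learning with epsilon-greedy exploration on a five-cell corridor. Q-learning is just one common algorithm chosen here for illustration. The corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical and not taken from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the rewarding terminal state
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment dynamics: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning; returns the learned table q[state][action]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a highest-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the only reward sits at the right end of the corridor, so the learned greedy policy should step right from every non-terminal cell. The discount factor `gamma` is what makes the agent value the delayed reward less the farther away it is.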

Papers

Showing 15051–15100 of 15113 papers

Title | Status | Hype
APRIL: Active Preference-learning based Reinforcement Learning | - | 0
The Arcade Learning Environment: An Evaluation Platform for General Agents | Code | 0
Reinforcement Learning of Question-Answering Dialogue Policies for Virtual Museum Guides | - | 0
Framework of Automatic Text Summarization Using Reinforcement Learning | - | 0
Monte Carlo Bayesian Reinforcement Learning | - | 0
Policy Gradients with Variance Related Risk Criteria | - | 0
Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods | - | 0
Off-Policy Actor-Critic | Code | 0
Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search | - | 0
Evaluation of Online Dialogue Policy Learning Techniques | - | 0
A Comparative Study of Reinforcement Learning Techniques on Dialogue Management | - | 0
PAC-Bayesian Policy Evaluation for Reinforcement Learning | - | 0
Multi-timescale Nexting in a Reinforcement Learning Robot | - | 0
Reinforcement Learning using Kernel-Based Stochastic Factorization | - | 0
Transfer from Multiple MDPs | - | 0
MAP Inference for Bayesian Inverse Reinforcement Learning | - | 0
Selecting the State-Representation in Reinforcement Learning | - | 0
Optimal Reinforcement Learning for Gaussian Systems | - | 0
Policy Gradient Coagent Networks | - | 0
Nonlinear Inverse Reinforcement Learning with Gaussian Processes | Code | 0
Action-Gap Phenomenon in Reinforcement Learning | - | 0
Blending Autonomous Exploration and Apprenticeship Learning | - | 0
A Reinforcement Learning Theory for Homeostatic Regulation | - | 0
Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery | - | 0
Analysis and Improvement of Policy Gradient Estimation | - | 0
Risk-Sensitive Reinforcement Learning Applied to Control under Constraints | - | 0
PAC-Bayesian Model Selection for Reinforcement Learning | - | 0
Predictive State Temporal Difference Learning | - | 0
LSTD with Random Projections | - | 0
Nonparametric Bayesian Policy Priors for Reinforcement Learning | - | 0
Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories | - | 0
Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains | - | 0
Feature Construction for Inverse Reinforcement Learning | - | 0
Double Q-learning | - | 0
Linear Complementarity for Regularized Policy Evaluation and Improvement | - | 0
Fast Reinforcement Learning for Energy-Efficient Wireless Communications | - | 0
Reinforcement Learning via AIXI Approximation | - | 0
Computational Model of Music Sight Reading: A Reinforcement Learning Approach | - | 0
Feature Selection as a One-Player Game | - | 0
A Generalized Natural Actor-Critic Algorithm | - | 0
Discrete MDL Predicts in Total Variation | - | 0
Solving Stochastic Games | - | 0
Manifold Embeddings for Model-Based Reinforcement Learning under Partial Observability | - | 0
Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference | - | 0
Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining | - | 0
A Monte Carlo AIXI Approximation | Code | 0
Hebbian Learning of Bayes Optimal Decisions | - | 0
Near-optimal Regret Bounds for Reinforcement Learning | - | 0
Optimization on a Budget: A Reinforcement Learning Approach | - | 0
Policy Search for Motor Primitives in Robotics | - | 0
Page 302 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified