SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
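As a rough illustration of the loop described above, the following sketch runs tabular Q-learning on a hypothetical five-state chain. The environment, the function names (`step`, `choose_action`, `train`), and all hyperparameters are assumptions made for this example; they are not taken from any paper listed on this page.

```python
import random

N_STATES = 5       # states 0..4; state 4 is the terminal goal (hypothetical toy task)
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor applied to future reward
ALPHA = 0.5        # learning rate
EPSILON = 0.2      # probability of taking a random exploratory action


def step(state, action):
    """Toy environment dynamics: return (next_state, reward, done)."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the goal
    return next_state, reward, done


def choose_action(q, state, rng):
    """Epsilon-greedy selection; ties between equal values are broken at random."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, max_steps=200, seed=0):
    """Learn action values from interaction using the Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal states: the action with the highest learned value.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

The agent only ever sees the reward signal, yet the learned values end up favoring the action that leads toward the goal, which is the sense in which the policy maximizes long-term reward in this toy setting.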

Papers

Showing 5676-5700 of 15113 papers

Title | Status | Hype
Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms | | 0
Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning | | 0
Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems | | 0
Sample Complexity of Kernel-Based Q-Learning | | 0
Sample Complexity of Multi-task Reinforcement Learning | | 0
Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds | | 0
Sample Complexity of Offline Distributionally Robust Linear Markov Decision Processes | | 0
Sample Complexity of Policy Gradient Finding Second-Order Stationary Points | | 0
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles | | 0
Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning | | 0
Sample Efficiency in Sparse Reinforcement Learning: Or Your Money Back | | 0
Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management | | 0
Sample-efficient Adversarial Imitation Learning from Observation | | 0
Sample-Efficient and Safe Deep Reinforcement Learning via Reset Deep Ensemble Agents | | 0
Curriculum Reinforcement Learning for Complex Reward Functions | | 0
Sample Efficient Deep Reinforcement Learning for Dialogue Systems with Large Action Spaces | | 0
Sample-efficient Deep Reinforcement Learning for Dialog Control | | 0
Sample Efficient Deep Reinforcement Learning via Local Planning | | 0
Sample-efficient Deep Reinforcement Learning with Imaginary Rollouts for Human-Robot Interaction | | 0
Sample-Efficient, Exploration-Based Policy Optimisation for Routing Problems | | 0
Sample Efficient Feature Selection for Factored MDPs | | 0
Physics-informed Imitative Reinforcement Learning for Real-world Driving | | 0
Sample-Efficient Learning of Nonprehensile Manipulation Policies via Physics-Based Informed State Distributions | | 0
Sample-Efficient Multi-Agent Reinforcement Learning with Demonstrations for Flocking Control | | 0
Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks | | 0
Page 228 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified