SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
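As a minimal sketch of the interaction loop described above, the following uses tabular Q-learning on a toy five-state chain. Everything here is an assumption made for illustration and does not come from any listed paper: the environment (`step`), the constants `GAMMA` and `ALPHA`, and the uniformly random behavior policy, which is workable because Q-learning is off-policy.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the rewarding terminal state (assumed toy setup)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed)
ALPHA = 0.5           # learning rate (assumed)


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, max_steps=200, seed=0):
    """Learn a Q-table from experience gathered by a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            if done:
                break
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Under these assumptions the greedy policy should choose "right" in every non-terminal state, since that is the shortest path to the only reward. A practical agent would usually replace the random behavior policy with an exploration scheme such as epsilon-greedy.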

Papers

Showing 9226–9250 of 15113 papers

Title | Status | Hype
Provably Efficient Lifelong Reinforcement Learning with Linear Function Approximation | | 0
Provably Efficient Model-Free Algorithms for Non-stationary CMDPs | | 0
Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication | | 0
Provably Efficient Multi-Task Reinforcement Learning with Model Transfer | | 0
Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus | | 0
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward | | 0
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources | | 0
Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints | | 0
Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle | | 0
Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle | | 0
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces | | 0
Provably Efficient Reinforcement Learning with Aggregated States | | 0
Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension | | 0
Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations | | 0
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints | | 0
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping | | 0
Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization | | 0
Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games | | 0
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems | | 0
Provably Efficient Reinforcement Learning via Surprise Bound | | 0
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL | | 0
Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation | | 0
Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning | | 0
Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics | | 0
Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified