SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
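
The interaction loop described above can be sketched with tabular Q-learning on a toy environment. This is only an illustrative sketch, not something taken from the listed papers: the `ChainEnv` corridor, the function names, and all hyperparameters are hypothetical, and Q-learning is just one standard algorithm that fits the description.

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor: start in state 0, reward 1.0 on reaching the last state."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 0 = move left, action 1 = move right; walls clamp the position
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train_q_learning(env, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, max_steps=100, seed=0):
    """Learn action values by interacting with env (all settings are assumptions)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])       # exploit, breaking ties at random
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted best next value
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = train_q_learning(ChainEnv())
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should choose "right" in every non-terminal state, which is the policy that maximizes the discounted long-term reward. The terminal state's entry is never updated, so its action is arbitrary.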

Papers

Showing 8601-8625 of 15113 papers

Title | Status | Hype
Provably Efficient Cooperative Multi-Agent Reinforcement Learning with Function Approximation |  | 0
Provably Efficient CVaR RL in Low-rank MDPs |  | 0
Provably Efficient Exploration in Policy Optimization |  | 0
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret |  | 0
Provably Efficient Exploration in Reward Machines with Low Regret |  | 0
Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback |  | 0
Provably Efficient Lifelong Reinforcement Learning with Linear Function Approximation |  | 0
Provably Efficient Model-Free Algorithms for Non-stationary CMDPs |  | 0
Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication |  | 0
Provably Efficient Multi-Task Reinforcement Learning with Model Transfer |  | 0
Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus |  | 0
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward |  | 0
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources |  | 0
Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints |  | 0
Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle |  | 0
Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle |  | 0
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces |  | 0
Provably Efficient Reinforcement Learning with Aggregated States |  | 0
Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension |  | 0
Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations |  | 0
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints |  | 0
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping |  | 0
Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization |  | 0
Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games |  | 0
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems |  | 0
Page 345 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified