SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, delivered as rewards or penalties. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
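The agent, environment, reward, and policy described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one common RL algorithm chosen here only as an example. The five-state corridor environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameters, including the discount factor `gamma`, are assumptions made for illustration and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a 5-state corridor. The agent starts in
# state 0 and receives a reward of +1 only on reaching state 4.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose(q_row, epsilon, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q_row)
    return rng.choice([i for i, v in enumerate(q_row) if v == best])


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; all hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = choose(q[state], epsilon, rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Read off the learned policy: best action in each non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns the action the learned policy prefers in each non-terminal state. In this toy corridor the reward sits at the right end, so a trained agent should prefer moving right (`+1`) everywhere.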

Papers

Showing 9026-9050 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Posterior Sampling for Large Scale Reinforcement Learning | | 0 |
| Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information | | 0 |
| Posterior sampling for reinforcement learning: worst-case regret bounds | | 0 |
| Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation | | 0 |
| Posterior Variance Analysis of Gaussian Processes with Application to Average Learning Curves | | 0 |
| Post-processing Networks: A Method for Optimizing Pipeline Task-oriented Dialogue Systems using Reinforcement Learning | | 0 |
| Potential-based Credit Assignment for Cooperative RL-based Testing of Autonomous Vehicles | | 0 |
| Potential Field Guided Actor-Critic Reinforcement Learning | | 0 |
| Potential Impacts of Smart Homes on Human Behavior: A Reinforcement Learning Approach | | 0 |
| Powderworld: A Platform for Understanding Generalization via Rich Task Distributions | | 0 |
| Power Allocation for Delay Optimization in Device-to-Device Networks: A Graph Reinforcement Learning Approach | | 0 |
| Power Allocation in Cache-Aided NOMA Systems: Optimization and Deep Reinforcement Learning Approaches | | 0 |
| Power and accountability in reinforcement learning applications to environmental policy | | 0 |
| Power and Accountability in RL-driven Environmental Policy | | 0 |
| Power and Interference Control for VLC-Based UDN: A Reinforcement Learning Approach | | 0 |
| Power Control for Wireless VBR Video Streaming: From Optimization to Reinforcement Learning | | 0 |
| Power Grid Cascading Failure Mitigation by Reinforcement Learning | | 0 |
| PowerNet: Multi-agent Deep Reinforcement Learning for Scalable Powergrid Control | | 0 |
| PowRL: A Reinforcement Learning Framework for Robust Management of Power Networks | | 0 |
| PPO-UE: Proximal Policy Optimization via Uncertainty-Aware Exploration | | 0 |
| Practical and efficient quantum circuit synthesis and transpiling with Reinforcement Learning | | 0 |
| Practical Kernel-Based Reinforcement Learning | | 0 |
| Practical Marginalized Importance Sampling with the Successor Representation | | 0 |
| Practical Risk Measures in Reinforcement Learning | | 0 |
| PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |