SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
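
As a rough illustration of the loop described above (act, receive a reward, update the strategy), here is a minimal sketch of tabular Q-learning on a toy task. The five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameters are assumptions made up for this example. They do not come from any paper listed on this page, and Q-learning is only one of many ways to learn a policy.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (hypothetical toy task)
LEFT, RIGHT = 0, 1    # the two available actions


def step(state, action):
    """Hypothetical corridor environment: returns (next_state, reward, done)."""
    if action == RIGHT:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    # Reward of 1 only when the goal is reached, 0 otherwise.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, max_steps=1000, seed=0):
    """Tabular Q-learning driven by a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice((LEFT, RIGHT))
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted best estimated future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Highest-valued action in each non-terminal state."""
    return [max((LEFT, RIGHT), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
print(greedy_policy(q_table))
```

In this sketch the learned greedy policy should choose `RIGHT` in every non-terminal state, since that is the shortest path to the only reward. The discount factor `gamma` is what makes the agent prefer reaching the reward sooner, which is one common way to formalize "long-term reward".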

Papers

Showing 8226-8250 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Preference Adaptive and Sequential Text-to-Image Generation | | 0 |
| Personalized Cancer Chemotherapy Schedule: a numerical comparison of performance and robustness in model-based and model-free scheduling methodologies | | 0 |
| Personalized Dynamic Pricing Policy for Electric Vehicles: Reinforcement learning approach | | 0 |
| Personalized Education at Scale | | 0 |
| Personalized Exposure Control Using Adaptive Metering and Reinforcement Learning | | 0 |
| Personalized Federated Hypernetworks for Privacy Preservation in Multi-Task Reinforcement Learning | | 0 |
| Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity | | 0 |
| Personalized Lane Change Decision Algorithm Using Deep Reinforcement Learning Approach | | 0 |
| Personalized Medical Treatments Using Novel Reinforcement Learning Algorithms | | 0 |
| Personalizing a Dialogue System with Transfer Reinforcement Learning | | 0 |
| Perspectives on the Social Impacts of Reinforcement Learning with Human Feedback | | 0 |
| Perspective Taking in Deep Reinforcement Learning Agents | | 0 |
| Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning | | 0 |
| Persuading to Prepare for Quitting Smoking with a Virtual Coach: Using States and User Characteristics to Predict Behavior | | 0 |
| Perturbational Complexity by Distribution Mismatch: A Systematic Analysis of Reinforcement Learning in Reproducing Kernel Hilbert Space | | 0 |
| Perturbation-based exploration methods in deep reinforcement learning | | 0 |
| Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes | | 0 |
| Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning | | 0 |
| Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning | | 0 |
| Pessimistic Causal Reinforcement Learning with Mediators for Confounded Offline Data | | 0 |
| Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage | | 0 |
| Pessimistic Model Selection for Offline Deep Reinforcement Learning | | 0 |
| Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning | | 0 |
| Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity | | 0 |
| Petri Net Machines for Human-Agent Interaction | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |