SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
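
To make the agent-environment loop concrete, here is a minimal sketch using tabular Q-learning, which is only one of many RL algorithms and is chosen here purely for illustration. The five-state corridor environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are assumptions made for this example and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a corridor of states 0..4, where state 4 is the goal.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)  # walls at both ends
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Action with the highest learned value in each non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]
```

Under these assumptions, `greedy_policy(train())` should select the move-right action in every non-terminal state, since that is the only route to the rewarded goal. The discount factor `gamma` is what turns "long-term reward" into a single number the agent can compare across actions.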

Papers

Showing 8201–8250 of 15113 papers

Title | Status | Hype
Pedestrian Prediction by Planning using Deep Neural Networks | | 0
Penalized Proximal Policy Optimization for Safe Reinforcement Learning | | 0
PEORL: Integrating Symbolic Planning and Hierarchical Reinforcement Learning for Robust Decision-Making | | 0
Perception and Navigation in Autonomous Systems in the Era of Learning: A Survey | | 0
Perception-Prediction-Reaction Agents for Deep Reinforcement Learning | | 0
Perceptual Reward Functions | | 0
Perceptual Values from Observation | | 0
Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms | | 0
Performance-Driven Controller Tuning via Derivative-Free Reinforcement Learning | | 0
Performance Dynamics and Termination Errors in Reinforcement Learning: A Unifying Perspective | | 0
Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach | | 0
Performance Optimization for Variable Bitwidth Federated Learning in Wireless Networks | | 0
Performance-Weighed Policy Sampling for Meta-Reinforcement Learning | | 0
Performative Reinforcement Learning | | 0
PERIL: Probabilistic Embeddings for hybrid Meta-Reinforcement and Imitation Learning | | 0
Periodic agent-state based Q-learning for POMDPs | | 0
PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning | | 0
PerSim: Data-Efficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators | | 0
Autonomous Reinforcement Learning via Subgoal Curricula | | 0
Persistent Rule-based Interactive Reinforcement Learning | | 0
Towards Personalization of User Preferences in Partially Observable Smart Home Environments | | 0
Personalisation via Dynamic Policy Fusion | | 0
Personalization for Web-based Services using Offline Reinforcement Learning | | 0
A clustering-based reinforcement learning approach for tailored personalization of e-Health interventions | | 0
Personalization of Hearing Aid Compression by Human-In-Loop Deep Reinforcement Learning | | 0
Preference Adaptive and Sequential Text-to-Image Generation | | 0
Personalized Cancer Chemotherapy Schedule: a numerical comparison of performance and robustness in model-based and model-free scheduling methodologies | | 0
Personalized Dynamic Pricing Policy for Electric Vehicles: Reinforcement learning approach | | 0
Personalized Education at Scale | | 0
Personalized Exposure Control Using Adaptive Metering and Reinforcement Learning | | 0
Personalized Federated Hypernetworks for Privacy Preservation in Multi-Task Reinforcement Learning | | 0
Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity | | 0
Personalized Lane Change Decision Algorithm Using Deep Reinforcement Learning Approach | | 0
Personalized Medical Treatments Using Novel Reinforcement Learning Algorithms | | 0
Personalizing a Dialogue System with Transfer Reinforcement Learning | | 0
Perspectives on the Social Impacts of Reinforcement Learning with Human Feedback | | 0
Perspective Taking in Deep Reinforcement Learning Agents | | 0
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning | | 0
Persuading to Prepare for Quitting Smoking with a Virtual Coach: Using States and User Characteristics to Predict Behavior | | 0
Perturbational Complexity by Distribution Mismatch: A Systematic Analysis of Reinforcement Learning in Reproducing Kernel Hilbert Space | | 0
Perturbation-based exploration methods in deep reinforcement learning | | 0
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes | | 0
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning | | 0
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning | | 0
Pessimistic Causal Reinforcement Learning with Mediators for Confounded Offline Data | | 0
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage | | 0
Pessimistic Model Selection for Offline Deep Reinforcement Learning | | 0
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning | | 0
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity | | 0
Petri Net Machines for Human-Agent Interaction | | 0
Page 165 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified