SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
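The loop described above (act, observe a reward, improve the policy) can be illustrated with a minimal sketch. It assumes a hypothetical five-state corridor environment and uses tabular Q-learning as one possible learning rule. The environment, the names `step`, `choose_action`, `train`, and `greedy_policy`, and all hyperparameters are illustrative assumptions, not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a corridor of states 0..4.
# The agent starts in state 0 and the episode ends on reaching state 4.
N_STATES = 5
GOAL = 4
LEFT, RIGHT = 0, 1


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == GOAL
    reward = 1.0 if done else 0.0  # reward only on reaching the goal
    return next_state, reward, done


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice; ties between actions are broken at random."""
    if rng.random() < epsilon or q_row[LEFT] == q_row[RIGHT]:
        return rng.randrange(2)
    return LEFT if q_row[LEFT] > q_row[RIGHT] else RIGHT


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: estimate the long-term reward of each action."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Extract the greedy action for every non-terminal state."""
    return [RIGHT if row[RIGHT] > row[LEFT] else LEFT for row in q[:GOAL]]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the only reward sits at the right end of the corridor, so the learned greedy policy should move right from every state. Real benchmarks replace the lookup table with a function approximator and the corridor with a far richer environment.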

Papers

Showing 8151-8175 of 15113 papers

Title | Status | Hype
Parameter Sharing Reinforcement Learning Architecture for Multi Agent Driving Behaviors | | 0
Parameter Sharing with Network Pruning for Scalable Multi-Agent Deep Reinforcement Learning | | 0
Paraphrase Generation with Deep Reinforcement Learning | | 0
Parental Guidance: Efficient Lifelong Learning through Evolutionary Distillation | | 0
Parenting: Safe Reinforcement Learning from Human Input | | 0
Pareto Deterministic Policy Gradients and Its Application in 5G Massive MIMO Networks | | 0
Pareto Frontier Approximation Network (PA-Net) to Solve Bi-objective TSP | | 0
Pareto Policy Adaptation | | 0
Pareto Policy Pool for Model-based Offline Reinforcement Learning | | 0
Pareto Set Learning for Multi-Objective Reinforcement Learning | | 0
ParMod: A Parallel and Modular Framework for Learning Non-Markovian Tasks | | 0
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning | | 0
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation | | 0
Parsing Natural Language into Propositional and First-Order Logic with Dual Reinforcement Learning | | 0
Parsing Natural Language into Propositional and First-Order Logic with Dual Reinforcement Learning | | 0
Part-Activated Deep Reinforcement Learning for Action Prediction | | 0
Partial End-to-end Reinforcement Learning for Robustness Against Modelling Error in Autonomous Racing | | 0
Partially Connected Automated Vehicle Cooperative Control Strategy with a Deep Reinforcement Learning Approach | | 0
Partially Detected Intelligent Traffic Signal Control: Environmental Adaptation | | 0
Partially Observable Multi-Agent Reinforcement Learning with Information Sharing | | 0
Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms | | 0
Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning | | 0
Partial Policy-based Reinforcement Learning for Anatomical Landmark Localization in 3D Medical Images | | 0
Partial Simulation for Imitation Learning | | 0
Particle Based Stochastic Policy Optimization | | 0
Page 327 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified