SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
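
As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one standard RL algorithm among many. The environment (`CorridorEnv`), the function names (`q_learning`, `greedy_policy`), the reward values, and all hyperparameters are hypothetical choices made for this example and do not come from any paper listed on this page.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a corridor of cells 0..length-1.

    The agent starts in cell 0 and the episode ends when it reaches
    the last cell. Action 0 moves left, action 1 moves right.
    """

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        # Assumed reward scheme: +1 at the goal, small penalty per step.
        reward = 1.0 if done else -0.01
        return self.state, reward, done


def q_learning(env, episodes=500, alpha=0.1, gamma=0.95,
               epsilon=0.1, max_steps=100, seed=0):
    """Learn a table of action values by interacting with env."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.randrange(2)
            else:
                # Exploit: pick a best-valued action, ties broken randomly.
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state, reward, done = env.step(action)
            # One-step target: reward plus discounted best next value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(q_learning(CorridorEnv()))` returns one action per cell. In this toy setting the learned policy for the non-terminal cells should be to move right, since that is the shortest route to the only positive reward.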

Papers

Showing 8151-8200 of 15113 papers

Title | Status | Hype
Parameter Sharing Reinforcement Learning Architecture for Multi Agent Driving Behaviors | | 0
Parameter Sharing with Network Pruning for Scalable Multi-Agent Deep Reinforcement Learning | | 0
Paraphrase Generation with Deep Reinforcement Learning | | 0
Parental Guidance: Efficient Lifelong Learning through Evolutionary Distillation | | 0
Parenting: Safe Reinforcement Learning from Human Input | | 0
Pareto Deterministic Policy Gradients and Its Application in 5G Massive MIMO Networks | | 0
Pareto Frontier Approximation Network (PA-Net) to Solve Bi-objective TSP | | 0
Pareto Policy Adaptation | | 0
Pareto Policy Pool for Model-based Offline Reinforcement Learning | | 0
Pareto Set Learning for Multi-Objective Reinforcement Learning | | 0
ParMod: A Parallel and Modular Framework for Learning Non-Markovian Tasks | | 0
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning | | 0
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation | | 0
Parsing Natural Language into Propositional and First-Order Logic with Dual Reinforcement Learning | | 0
Parsing Natural Language into Propositional and First-Order Logic with Dual Reinforcement Learning | | 0
Part-Activated Deep Reinforcement Learning for Action Prediction | | 0
Partial End-to-end Reinforcement Learning for Robustness Against Modelling Error in Autonomous Racing | | 0
Partially Connected Automated Vehicle Cooperative Control Strategy with a Deep Reinforcement Learning Approach | | 0
Partially Detected Intelligent Traffic Signal Control: Environmental Adaptation | | 0
Partially Observable Multi-Agent Reinforcement Learning with Information Sharing | | 0
Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms | | 0
Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning | | 0
Partial Policy-based Reinforcement Learning for Anatomical Landmark Localization in 3D Medical Images | | 0
Partial Simulation for Imitation Learning | | 0
Particle Based Stochastic Policy Optimization | | 0
Particle Swarm Optimization for Generating Interpretable Fuzzy Reinforcement Learning Policies | | 0
Particle Value Functions | | 0
Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks | | 0
Partner Approximating Learners (PAL): Simulation-Accelerated Learning with Explicit Partner Modeling in Multi-Agent Domains | | 0
Partner Personas Generation for Dialogue Response Generation | | 0
PassGoodPool: Joint Passengers and Goods Fleet Management with Reinforcement Learning aided Pricing, Matching, and Route Planning | | 0
Passing Through Narrow Gaps with Deep Reinforcement Learning | | 0
Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems | | 0
Path Design and Resource Management for NOMA enhanced Indoor Intelligent Robots | | 0
Pathfinding in Random Partially Observable Environments with Vision-Informed Deep Reinforcement Learning | | 0
Path Following and Stabilisation of a Bicycle Model using a Reinforcement Learning Approach | | 0
Path Integral Networks: End-to-End Differentiable Optimal Control | | 0
Machine learning strategies for path-planning microswimmers in turbulent flows | | 0
Path Planning of Cleaning Robot with Reinforcement Learning | | 0
Path Planning using Reinforcement Learning: A Policy Iteration Approach | | 0
Patient level simulation and reinforcement learning to discover novel strategies for treating ovarian cancer | | 0
Patterns, predictions, and actions: A story about machine learning | | 0
Pattern Transfer Learning for Reinforcement Learning in Order Dispatching | | 0
Pauli Network Circuit Synthesis with Reinforcement Learning | | 0
Paused Agent Replay Refresh | | 0
Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making | | 0
PBCS : Efficient Exploration and Exploitation Using a Synergy between Reinforcement Learning and Motion Planning | | 0
PDQN - A Deep Reinforcement Learning Method for Planning with Long Delays: Optimization of Manufacturing Dispatching | | 0
PEARL: Parallelized Expert-Assisted Reinforcement Learning for Scene Rearrangement Planning | | 0
PEAR: Primitive enabled Adaptive Relabeling for boosting Hierarchical Reinforcement Learning | | 0
Page 164 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified