SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback: each action yields a reward or a penalty. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
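To make the agent, environment, and reward loop concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than something taken from the papers listed on this page: the five-state chain environment, the `step` function, and the hyperparameters are all hypothetical. Tabular Q-learning is used only as one simple example of learning a policy from rewards.

```python
import random


def discounted_return(rewards, gamma):
    """Cumulative reward, with each later reward discounted by gamma per step."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical toy environment: states 0..4 on a line, goal at the right end.
N_STATES = 5
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1


def step(state, action):
    """Move one cell; reaching GOAL yields reward 1 and ends the episode."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice([LEFT, RIGHT])  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in (LEFT, RIGHT) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for every non-terminal state."""
    return [max((LEFT, RIGHT), key=lambda a: q[s][a]) for s in range(GOAL)]
```

Calling `greedy_policy(q_learning())` returns the learned action for each non-terminal state. In this toy chain the policy that maximizes long-term reward is to move right everywhere, so the result should be `RIGHT` for every state.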

Papers

Showing 8301-8350 of 15113 papers

Title | Status | Hype
Plan Your Target and Learn Your Skills: State-Only Imitation Learning via Decoupled Policy Optimization | | 0
Plasticity Loss in Deep Reinforcement Learning: A Survey | | 0
Playing 20 Question Game with Policy-Based Reinforcement Learning | | 0
Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning | | 0
Playing a Strategy Game with Knowledge-Based Reinforcement Learning | | 0
Playing Atari Ball Games with Hierarchical Reinforcement Learning | | 0
Playing Atari with Capsule Networks: A systematic comparison of CNN and CapsNets-based agents. | | 0
Playing Catan with Cross-dimensional Neural Network | | 0
Playing Flappy Bird via Asynchronous Advantage Actor Critic Algorithm | | 0
Playing Go without Game Tree Search Using Convolutional Neural Networks | | 0
Playing optical tweezers with deep reinforcement learning: in virtual, physical and augmented environments | | 0
Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP | | 0
Playtesting: What is Beyond Personas | | 0
Play with Emotion: Affect-Driven Reinforcement Learning | | 0
PlotThread: Creating Expressive Storyline Visualizations using Reinforcement Learning | | 0
Plug and Play, Model-Based Reinforcement Learning | | 0
PoBRL: Optimizing Multi-Document Summarization by Blending Reinforcement Learning Policies | | 0
PODS: Policy Optimization via Differentiable Simulation | | 0
Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial Observability in Visual Navigation | | 0
Pointer Networks with Q-Learning for Combinatorial Optimization | | 0
PointPatchRL -- Masked Reconstruction Improves Reinforcement Learning on Point Clouds | | 0
Poisoning Deep Reinforcement Learning Agents with In-Distribution Triggers | | 0
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone | | 0
Policy and Value Transfer in Lifelong Reinforcement Learning | | 0
Policy-Based Bayesian Experimental Design for Non-Differentiable Implicit Models | | 0
Policy-Based Radiative Transfer: Solving the 2-Level Atom Non-LTE Problem using Soft Actor-Critic Reinforcement Learning | | 0
Policy-Based Trajectory Clustering in Offline Reinforcement Learning | | 0
Policy Certificates: Towards Accountable Reinforcement Learning | | 0
PolicyCleanse: Backdoor Detection and Mitigation for Competitive Reinforcement Learning | | 0
PolicyClusterGCN: Identifying Efficient Clusters for Training Graph Convolutional Networks | | 0
Policy Distillation and Value Matching in Multiagent Reinforcement Learning | | 0
Policy Distillation with Selective Input Gradient Regularization for Efficient Interpretability | | 0
Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning | | 0
Policy Entropy for Out-of-Distribution Classification | | 0
Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models | | 0
Policy Evaluation and Seeking for Multi-Agent Reinforcement Learning via Best Response | | 0
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning | | 0
Policy-focused Agent-based Modeling using RL Behavioral Models | | 0
Policy Fusion for Adaptive and Customizable Reinforcement Learning Agents | | 0
Policy Generalization In Capacity-Limited Reinforcement Learning | | 0
PolicyGNN: Aggregation Optimization for Graph Neural Networks | | 0
Policy-Gradient Algorithms Have No Guarantees of Convergence in Linear Quadratic Games | | 0
Policy Gradient Algorithms with Monte Carlo Tree Learning for Non-Markov Decision Processes | | 0
Policy Gradient based Quantum Approximate Optimization Algorithm | | 0
Policy Gradient Coagent Networks | | 0
Policy Gradient For Multidimensional Action Spaces: Action Sampling and Entropy Bonus | | 0
Policy Gradient for Reinforcement Learning with General Utilities | | 0
Policy Gradient Method For Robust Reinforcement Learning | | 0
Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines | | 0
Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence | | 0
Page 167 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified