SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
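As a rough illustration of the loop described above, here is a minimal sketch of tabular Q-learning on a toy five-cell corridor. The environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameters are hypothetical choices made for this example; none of them come from the papers listed on this page.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the rewarding terminal state
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Learn action values from reward feedback with one-step Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Assumption: a purely random behaviour policy; Q-learning is
            # off-policy, so it can still estimate the greedy policy's values.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

Here `policy` holds one action index per non-terminal cell. The discount factor `gamma` is what makes the agent weigh long-term reward: cells further from the goal receive smaller, but still positive, values for moving toward it.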

Papers

Showing 2951–2975 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Fast Convergence Theory for Offline Decision Making | | 0 |
| ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search | | 0 |
| AACC: Asymmetric Actor-Critic in Contextual Reinforcement Learning | | 0 |
| Decoupled Learning of Environment Characteristics for Safe Exploration | | 0 |
| A Theory of Abstraction in Reinforcement Learning | | 0 |
| A Theoretical Connection Between Statistical Physics and Reinforcement Learning | | 0 |
| A Hybrid Approach Between Adversarial Generative Networks and Actor-Critic Policy Gradient for Low Rate High-Resolution Image Compression | | 0 |
| A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes | | 0 |
| A Human Mixed Strategy Approach to Deep Reinforcement Learning | | 0 |
| Adaptive Actor-Critic Based Optimal Regulation for Drift-Free Uncertain Nonlinear Systems | | 0 |
| A Tensor Network Approach to Finite Markov Decision Processes | | 0 |
| A Temporal-Pattern Backdoor Attack to Deep Reinforcement Learning | | 0 |
| A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors | | 0 |
| A Temporal Difference Reinforcement Learning Theory of Emotion: unifying emotion, cognition and adaptive behavior | | 0 |
| A Human-Centered Data-Driven Planner-Actor-Critic Architecture via Logic Programming | | 0 |
| Adaptive action supervision in reinforcement learning from real-world multi-agent demonstrations | | 0 |
| ACDER: Augmented Curiosity-Driven Experience Replay | | 0 |
| Decorrelated Soft Actor-Critic for Efficient Deep Reinforcement Learning | | 0 |
| Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration | | 0 |
| Decoupling Strategy and Surface Realization for Task-oriented Dialogues | | 0 |
| A Technique to Create Weaker Abstract Board Game Agents via Reinforcement Learning | | 0 |
| A Technical Study into Small Reasoning Language Models | | 0 |
| A Homogenization Approach for Gradient-Dominated Stochastic Optimization | | 0 |
| A Teacher-Student Framework for Maintainable Dialog Manager | | 0 |
| A Taxonomy of Similarity Metrics for Markov Decision Processes | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |