SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
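The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed here: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made for the example.

```python
import random

N_STATES = 5      # hypothetical corridor with cells 0..4; cell 4 is terminal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward 1.0 only when the terminal cell is reached, otherwise 0.0.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Explore with probability epsilon, or when both actions tie.
            explore = rng.random() < epsilon or q[state][0] == q[state][1]
            if explore:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-valued action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns one action per non-terminal cell. Because the only reward sits at the right end of the corridor, the learned policy should favor moving right, which is the sense in which the agent comes to maximize long-term rather than immediate reward.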

Papers

Showing 4801–4825 of 15113 papers

Title | Status | Hype
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs | | 0
A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning | | 0
A Primer on Maximum Causal Entropy Inverse Reinforcement Learning | | 0
A Principled Permutation Invariant Approach to Mean-Field Multi-Agent Reinforcement Learning | | 0
A Privacy-preserving Distributed Training Framework for Cooperative Multi-agent Deep Reinforcement Learning | | 0
A Proposal: Interactively Learning to Summarise Timelines by Reinforcement Learning | | 0
A Provable Approach for End-to-End Safe Reinforcement Learning | | 0
A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning | | 0
A Provably Efficient Sample Collection Strategy for Reinforcement Learning | | 0
APT: Adaptive Perceptual quality based camera Tuning using reinforcement learning | | 0
A Quantum States Preparation Method Based on Difference-Driven Reinforcement Learning | | 0
AR3n: A Reinforcement Learning-based Assist-As-Needed Controller for Robotic Rehabilitation | | 0
A random measure approach to reinforcement learning in continuous time | | 0
Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning | | 0
ARC -- Actor Residual Critic for Adversarial Imitation Learning | | 0
Arcades: A deep model for adaptive decision making in voice controlled smart-home | | 0
Architecting and Visualizing Deep Reinforcement Learning Models | | 0
A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards | | 0
A Real-World Quadrupedal Locomotion Benchmark for Offline Reinforcement Learning | | 0
Area-wide traffic signal control based on a deep graph Q-Network (DGQN) trained in an asynchronous manner | | 0
A Reduction Approach to Constrained Reinforcement Learning | | 0
A Reduction from Reinforcement Learning to No-Regret Online Learning | | 0
Are Gradient-based Saliency Maps Useful in Deep Reinforcement Learning? | | 0
A Behavior Regularized Implicit Policy for Offline Reinforcement Learning | | 0
A Regulation Enforcement Solution for Multi-agent Reinforcement Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified