SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
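
The interaction loop described above can be sketched with a toy example. The following is a minimal illustration only, not a method taken from any paper listed on this page: the five-cell corridor environment, the function names, and the hyperparameters (`alpha`, `epsilon`, `GAMMA`) are all assumptions made for the sketch, and tabular Q-learning is just one of many algorithms that could fill this role.

```python
import random

N_STATES = 5        # hypothetical corridor with cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)  # move left, move right
GAMMA = 0.9         # assumed discount factor for long-term reward


def step(state, action):
    """Toy environment: move within the corridor, reward 1.0 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q, state, rng=None):
    """Index of the highest-valued action; ties are broken at random if rng is given."""
    best = max(q[state])
    candidates = [i for i, v in enumerate(q[state]) if v == best]
    return rng.choice(candidates) if rng is not None else candidates[0]


def train(episodes=500, alpha=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))   # explore: random action
            else:
                a = greedy_action(q, state, rng)  # exploit: best known action
            next_state, reward, done = step(state, ACTIONS[a])
            # Feedback from the environment updates the value estimate.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def rollout(q, max_steps=50):
    """Follow the greedy policy from cell 0 and collect the rewards it receives."""
    state, rewards = 0, []
    for _ in range(max_steps):
        state, reward, done = step(state, ACTIONS[greedy_action(q, state)])
        rewards.append(reward)
        if done:
            break
    return rewards


def discounted_return(rewards, gamma=GAMMA):
    """Cumulative reward: sum over t of gamma**t * r_t."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


q_table = train()
policy = [ACTIONS[greedy_action(q_table, s)] for s in range(N_STATES - 1)]
episode_return = discounted_return(rollout(q_table))
print(policy, episode_return)
```

Here `policy` lists the greedy action for each non-terminal cell, and `episode_return` is the discounted cumulative reward the agent collects when it follows that policy from the start cell.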

Papers

Showing 8026-8050 of 15113 papers

Title | Status | Hype
Model-based Reinforcement Learning for Service Mesh Fault Resiliency in a Web Application-level | | 0
Reinforcement Learning Based Optimal Camera Placement for Depth Observation of Indoor Scenes | | 0
More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences | | 0
Playing 2048 With Reinforcement Learning | Code | 0
Transferring Reinforcement Learning for DC-DC Buck Converter Control via Duty Ratio Mapping: From Simulation to Implementation | | 0
Computationally Efficient Safe Reinforcement Learning for Power Systems | | 0
Socialbots on Fire: Modeling Adversarial Behaviors of Socialbots via Multi-Agent Hierarchical Reinforcement Learning | | 0
Distributed Reinforcement Learning for Privacy-Preserving Dynamic Edge Caching | | 0
Feedback Linearization of Car Dynamics for Racing via Reinforcement Learning | | 0
Learning Robotic Manipulation Skills Using an Adaptive Force-Impedance Action Space | | 0
Balancing Value Underestimation and Overestimation with Realistic Actor-Critic | Code | 0
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | | 0
Continuous Control with Action Quantization from Demonstrations | | 0
Aesthetic Photo Collage with Deep Reinforcement Learning | | 0
Improved cooperation by balancing exploration and exploitation in intertemporal social dilemma tasks | | 0
Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes | | 0
State-based Episodic Memory for Multi-Agent Reinforcement Learning | | 0
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game | | 0
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm | | 0
Sim-to-Real Transfer in Multi-agent Reinforcement Networking for Federated Edge Computing | | 0
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs | | 0
Provable Hierarchy-Based Meta-Reinforcement Learning | | 0
Reinforcement Learning-Based Coverage Path Planning with Implicit Cellular Decomposition | | 0
Option Transfer and SMDP Abstraction with Successor Features | | 0
Improving Robustness of Reinforcement Learning for Power System Control with Adversarial Training | | 0
Page 322 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified