SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
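The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustration under stated assumptions: the five-state corridor, the `step` and `train` functions, and all hyperparameter values are hypothetical and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values Q[state][action] from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore sometimes (and on ties), else act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states (1 means "step right").
policy = [0 if q[0] > q[1] else 1 for q in q_table[:-1]]
print(policy)
```

The reward arrives only at the goal, so the discount factor `gamma` is what makes earlier states prefer the shorter path: values learned near the goal propagate backward one step at a time.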

Papers

Showing 7001-7050 of 15113 papers

Title | Status | Hype
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks | | 0
ReLAX: Reinforcement Learning Agent eXplainer for Arbitrary Predictive Models | Code | 0
A Versatile and Efficient Reinforcement Learning Framework for Autonomous Driving | Code | 1
Reinforcement Learning for Process Control with Application in Semiconductor Manufacturing | | 0
Patient level simulation and reinforcement learning to discover novel strategies for treating ovarian cancer | | 0
Model-based Reinforcement Learning for Service Mesh Fault Resiliency in a Web Application-level | | 0
Reinforcement Learning Based Optimal Camera Placement for Depth Observation of Indoor Scenes | | 0
Off-Dynamics Inverse Reinforcement Learning from Hetero-Domain | | 0
Sequential Voting with Relational Box Fields for Active Object Detection | Code | 1
Anti-Concentrated Confidence Bonuses for Scalable Exploration | | 0
Is High Variance Unavoidable in RL? A Case Study in Continuous Control | | 0
Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information | | 0
Deep Reinforcement Learning for Online Control of Stochastic Partial Differential Equations | | 0
Can Q-learning solve Multi Armed Bantids? | | 0
LOA: Logical Optimal Actions for Text-based Interaction Games | Code | 1
Neuro-Symbolic Reinforcement Learning with First-Order Logic | | 0
Playing 2048 With Reinforcement Learning | Code | 0
Computationally Efficient Safe Reinforcement Learning for Power Systems | | 0
Feedback Linearization of Car Dynamics for Racing via Reinforcement Learning | | 0
Distributed Reinforcement Learning for Privacy-Preserving Dynamic Edge Caching | | 0
Hierarchical Skills for Efficient Exploration | Code | 1
Socialbots on Fire: Modeling Adversarial Behaviors of Socialbots via Multi-Agent Hierarchical Reinforcement Learning | | 0
More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences | | 0
Transferring Reinforcement Learning for DC-DC Buck Converter Control via Duty Ratio Mapping: From Simulation to Implementation | | 0
Improved cooperation by balancing exploration and exploitation in intertemporal social dilemma tasks | | 0
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | | 0
Contrastive Active Inference | Code | 1
CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents | Code | 1
Balancing Value Underestimation and Overestimation with Realistic Actor-Critic | Code | 0
Continuous Control with Action Quantization from Demonstrations | | 0
Aesthetic Photo Collage with Deep Reinforcement Learning | | 0
Learning Robotic Manipulation Skills Using an Adaptive Force-Impedance Action Space | | 0
Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes | | 0
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game | | 0
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm | | 0
Offline Reinforcement Learning with Value-based Episodic Memory | Code | 1
State-based Episodic Memory for Multi-Agent Reinforcement Learning | | 0
RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System | Code | 1
Embracing advanced AI/ML to help investors achieve success: Vanguard Reinforcement Learning for Financial Goal Planning | | 0
An actor-critic algorithm with policy gradients to solve the job shop scheduling problem using deep double recurrent agents | Code | 1
Improving Robustness of Reinforcement Learning for Power System Control with Adversarial Training | | 0
Edge Rewiring Goes Neural: Boosting Network Resilience without Rich Features | Code | 1
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs | | 0
No RL, No Simulation: Learning to Navigate without Navigating | Code | 1
Option Transfer and SMDP Abstraction with Successor Features | | 0
Sim-to-Real Transfer in Multi-agent Reinforcement Networking for Federated Edge Computing | | 0
Reinforcement Learning-Based Coverage Path Planning with Implicit Cellular Decomposition | | 0
Provable Hierarchy-Based Meta-Reinforcement Learning | | 0
Accelerating lifelong reinforcement learning via reshaping rewards | Code | 1
Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization | | 0
Page 141 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified