SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
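To make the interaction loop concrete, here is a minimal sketch of tabular Q-learning, one common RL algorithm. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are assumptions chosen for illustration, not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values Q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action with probability epsilon.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns one action per non-terminal state. The discount factor `gamma` is what encodes the "long-term" part of the objective: a reward that is several steps away still raises the value of earlier actions, only by less.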

Papers

Showing 4526-4550 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Attitude Control of Highly Maneuverable Aircraft Using an Improved Q-learning | | 0 |
| Faster and more diverse de novo molecular optimization with double-loop reinforcement learning using augmented SMILES | | 0 |
| Probing Transfer in Deep Reinforcement Learning without Task Engineering | | 0 |
| Towards Quantum-Enabled 6G Slicing | | 0 |
| Rate-Splitting for Intelligent Reflecting Surface-Aided Multiuser VR Streaming | Code | 0 |
| Epistemic Monte Carlo Tree Search | | 0 |
| On the connection between Bregman divergence and value in regularized Markov decision processes | | 0 |
| Implicit Offline Reinforcement Learning via Supervised Learning | | 0 |
| Continual Vision-based Reinforcement Learning with Group Symmetries | | 0 |
| Biologically Plausible Variational Policy Gradient with Spiking Recurrent Winner-Take-All Networks | Code | 0 |
| Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables | | 0 |
| Deep Reinforcement Learning for Stabilization of Large-scale Probabilistic Boolean Networks | | 0 |
| Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities | | 0 |
| Deep Reinforcement Learning for Inverse Inorganic Materials Design | | 0 |
| Integrating Policy Summaries with Reward Decomposition for Explaining Reinforcement Learning Agents | | 0 |
| PaCo: Parameter-Compositional Multi-Task Reinforcement Learning | Code | 1 |
| Fine-Grained Session Recommendations in E-commerce using Deep Reinforcement Learning | | 0 |
| Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes | | 0 |
| Robust Imitation via Mirror Descent Inverse Reinforcement Learning | | 0 |
| Model-based Lifelong Reinforcement Learning with Bayesian Exploration | Code | 0 |
| MoCoDA: Model-based Counterfactual Data Augmentation | Code | 1 |
| The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning | Code | 0 |
| Safe Policy Improvement in Constrained Markov Decision Processes | | 0 |
| Task Phasing: Automated Curriculum Learning from Demonstrations | Code | 0 |
| RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control | Code | 1 |
Page 182 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |