SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
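
The loop described above (act, receive a reward, update the policy) can be sketched with tabular Q-learning. This is a minimal illustration, not a method taken from any paper listed here. The five-state corridor environment, the `step` and `train` helpers, and all hyperparameter values are assumptions made up for the example.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, +1)    # index 0 moves left, index 1 moves right
GOAL = N_STATES - 1

def step(state, action):
    """Environment: move, clip to the corridor, reward 1.0 only at the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (and on ties), otherwise exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal states 0..3.
policy = ["left" if left > right else "right" for left, right in q_table[:GOAL]]
print(policy)
```

Here the discount factor `gamma` is what makes the agent value long-term reward: a state closer to the goal ends up with a higher estimated value than one farther away, so the greedy policy heads toward the goal.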

Papers

Showing 6401-6450 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV | | 0 |
| Backward Imitation and Forward Reinforcement Learning via Bi-directional Model Rollouts | | 0 |
| Human Decision Makings on Curriculum Reinforcement Learning with Difficulty Adjustment | | 0 |
| Towards Augmented Microscopy with Reinforcement Learning-Enhanced Workflows | Code | 0 |
| Transferable Multi-Agent Reinforcement Learning with Dynamic Participating Agents | | 0 |
| Supervised and Reinforcement Learning from Observations in Reconnaissance Blind Chess | | 0 |
| Reinforcement Learning for Joint V2I Network Selection and Autonomous Driving Policies | | 0 |
| AACC: Asymmetric Actor-Critic in Contextual Reinforcement Learning | | 0 |
| Deep VULMAN: A Deep Reinforcement Learning-Enabled Cyber Vulnerability Management Framework | | 0 |
| A Lightweight Transmission Parameter Selection Scheme Using Reinforcement Learning for LoRaWAN | | 0 |
| Joint Sensing and Communications for Deep Reinforcement Learning-based Beam Management in 6G | | 0 |
| Chemotaxis of sea urchin sperm cells through deep reinforcement learning | | 0 |
| Digital Twin-Assisted Efficient Reinforcement Learning for Edge Task Scheduling | | 0 |
| Smart caching in a Data Lake for High Energy Physics analysis | | 0 |
| Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach | Code | 0 |
| VacciNet: Towards a Smart Framework for Learning the Distribution Chain Optimization of Vaccines for a Pandemic | | 0 |
| Retrieval of surgical phase transitions using reinforcement learning | | 0 |
| Hierarchical Reinforcement Learning for Precise Soccer Shooting Skills using a Quadrupedal Robot | | 0 |
| Learning to Grasp on the Moon from 3D Octree Observations with Deep Reinforcement Learning | | 0 |
| A Maintenance Planning Framework using Online and Offline Deep Reinforcement Learning | | 0 |
| Learning to generate Reliable Broadcast Algorithms | | 0 |
| Robot Policy Learning from Demonstration Using Advantage Weighting and Early Termination | | 0 |
| Using Chatbots to Teach Languages | | 0 |
| Solving the vehicle routing problem with deep reinforcement learning | | 0 |
| Reinforcement learning with experience replay and adaptation of action dispersion | | 0 |
| A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes | | 0 |
| Deep Reinforcement Learning for System-on-Chip: Myths and Realities | | 0 |
| Cyclic Policy Distillation: Sample-Efficient Sim-to-Real Reinforcement Learning with Domain Randomization | Code | 0 |
| Combining Evolutionary Search with Behaviour Cloning for Procedurally Generated Content | | 0 |
| Meta Reinforcement Learning with Successor Feature Based Context | | 0 |
| Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions | | 0 |
| Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation and Complexity Analysis | Code | 0 |
| Raising Student Completion Rates with Adaptive Curriculum and Contextual Bandits | | 0 |
| Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning | | 0 |
| RangL: A Reinforcement Learning Competition Platform | | 0 |
| Latent Properties of Lifelong Learning Systems | | 0 |
| Graph Inverse Reinforcement Learning from Diverse Videos | | 0 |
| Dynamic Shielding for Reinforcement Learning in Black-Box Environments | | 0 |
| Distributional Actor-Critic Ensemble for Uncertainty-Aware Continuous Control | | 0 |
| A Contact-Safe Reinforcement Learning Framework for Contact-Rich Robot Manipulation | | 0 |
| POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning | | 0 |
| Structural Similarity for Improved Transfer in Reinforcement Learning | | 0 |
| Multi-Objective Provisioning of Network Slices using Deep Reinforcement Learning | | 0 |
| Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms | Code | 0 |
| Unsupervised Training for Neural TSP Solver | | 0 |
| Semi-analytical Industrial Cooling System Model for Reinforcement Learning | | 0 |
| Offline Reinforcement Learning at Multiple Frequencies | | 0 |
| Planning and Learning: Path-Planning for Autonomous Vehicles, a Review of the Literature | | 0 |
| Branch Ranking for Efficient Mixed-Integer Programming via Offline Ranking-based Policy Learning | | 0 |
| Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy | Code | 0 |
Page 129 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |