SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
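As an illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one common RL algorithm chosen here only as an example. The corridor environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameters are hypothetical and are not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a 5-state corridor. The agent starts in
# state 0 and receives reward 1.0 only on reaching the rightmost state,
# which ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore occasionally (or on ties),
            # otherwise take the action with the higher estimate.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted
            # estimate of the best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action (-1 or +1) for each non-terminal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
```

In this toy setting, `greedy_policy(train())` should select the move-right action in every non-terminal state, which is the policy that maximizes the discounted long-term reward here.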

Papers

Showing 5101-5150 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes | | 0 |
| Unified Automatic Control of Vehicular Systems with Reinforcement Learning | Code | 1 |
| Solving the vehicle routing problem with deep reinforcement learning | | 0 |
| Reinforcement learning with experience replay and adaptation of action dispersion | | 0 |
| Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation and Complexity Analysis | Code | 0 |
| Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions | | 0 |
| Meta Reinforcement Learning with Successor Feature Based Context | | 0 |
| Combining Evolutionary Search with Behaviour Cloning for Procedurally Generated Content | | 0 |
| Cyclic Policy Distillation: Sample-Efficient Sim-to-Real Reinforcement Learning with Domain Randomization | Code | 0 |
| Deep Reinforcement Learning for System-on-Chip: Myths and Realities | | 0 |
| Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning | Code | 1 |
| Graph Inverse Reinforcement Learning from Diverse Videos | | 0 |
| Latent Properties of Lifelong Learning Systems | | 0 |
| RangL: A Reinforcement Learning Competition Platform | | 0 |
| Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning | | 0 |
| Raising Student Completion Rates with Adaptive Curriculum and Contextual Bandits | | 0 |
| POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning | | 0 |
| Multi-Objective Provisioning of Network Slices using Deep Reinforcement Learning | | 0 |
| Structural Similarity for Improved Transfer in Reinforcement Learning | | 0 |
| Distributional Actor-Critic Ensemble for Uncertainty-Aware Continuous Control | | 0 |
| Dynamic Shielding for Reinforcement Learning in Black-Box Environments | | 0 |
| A Contact-Safe Reinforcement Learning Framework for Contact-Rich Robot Manipulation | | 0 |
| Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms | Code | 0 |
| Unsupervised Training for Neural TSP Solver | | 0 |
| Branch Ranking for Efficient Mixed-Integer Programming via Offline Ranking-based Policy Learning | | 0 |
| Learning Bipedal Walking On Planned Footsteps For Humanoid Robots | Code | 3 |
| Offline Reinforcement Learning at Multiple Frequencies | | 0 |
| Semi-analytical Industrial Cooling System Model for Reinforcement Learning | | 0 |
| Planning and Learning: Path-Planning for Autonomous Vehicles, a Review of the Literature | | 0 |
| Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning | | 0 |
| Cooperative Actor-Critic via TD Error Aggregation | | 0 |
| Flowsheet synthesis through hierarchical reinforcement learning and graph neural networks | | 0 |
| Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy | Code | 0 |
| Online Reinforcement Learning for Periodic MDP | | 0 |
| Adaptive Asynchronous Control Using Meta-learned Neural Ordinary Differential Equations | | 0 |
| Post-processing Networks: Method for Optimizing Pipeline Task-oriented Dialogue Systems using Reinforcement Learning | Code | 0 |
| REPNP: Plug-and-Play with Deep Reinforcement Learning Prior for Robust Image Restoration | | 0 |
| Lifelong Machine Learning of Functionally Compositional Structures | Code | 1 |
| Learning Soccer Juggling Skills with Layer-wise Mixture-of-Experts | Code | 1 |
| Adaptive Decision Making at the Intersection for Autonomous Vehicles Based on Skill Discovery | | 0 |
| Anti-Overestimation Dialogue Policy Learning for Task-Completion Dialogue System | | 0 |
| Driver Dojo: A Benchmark for Generalizable Reinforcement Learning for Autonomous Driving | Code | 1 |
| Halftoning with Multi-Agent Deep Reinforcement Learning | | 0 |
| Epersist: A Self Balancing Robot Using PID Controller And Deep Reinforcement Learning | | 0 |
| Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning | Code | 1 |
| Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution | | 0 |
| Robust Knowledge Adaptation for Dynamic Graph Neural Networks | Code | 1 |
| Towards Robust On-Ramp Merging via Augmented Multimodal Reinforcement Learning | | 0 |
| Solving the optimal stopping problem with reinforcement learning: an application in financial option exercise | Code | 0 |
| Strategising template-guided needle placement for MR-targeted prostate biopsy | | 0 |
Page 103 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |