SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
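As a minimal sketch of the interaction loop described above, the following Python snippet trains an agent with tabular Q-learning, one common RL algorithm chosen here purely for illustration. The five-cell corridor environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are assumptions made for this example and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment (an assumption for illustration): a corridor of
# N_STATES cells. The agent starts at cell 0 and receives reward +1 on reaching
# the last cell, which ends the episode; every other step gives reward 0.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when the two values tie);
            # otherwise pick the action with the higher estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action for each non-terminal cell."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


q_table = train()
policy = greedy_policy(q_table)
```

Here the reward signal is the only feedback the agent receives, and the discount factor `gamma` is what makes the learned values reflect long-term rather than immediate reward. After training, `policy` holds the greedy action for each non-terminal cell, which in this toy corridor should favour moving toward the rewarded end.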

Papers

Showing 12551–12600 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Deep reinforcement learning for scheduling in large-scale networked control systems | | 0 |
| Expressive Priors in Bayesian Neural Networks: Kernel Combinations and Periodic Functions | Code | 0 |
| Reinforcement Learning for Robotics and Control with Active Uncertainty Reduction | | 0 |
| Meta reinforcement learning as task inference | Code | 0 |
| QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning | Code | 1 |
| TauRieL: Targeting Traveling Salesman Problem with a deep reinforcement learning inspired architecture | | 0 |
| Trajectory-Based Off-Policy Deep Reinforcement Learning | Code | 0 |
| Successor Options: An Option Discovery Framework for Reinforcement Learning | Code | 0 |
| Variational Regret Bounds for Reinforcement Learning | | 0 |
| Combining Parametric and Nonparametric Models for Off-Policy Evaluation | | 0 |
| Control Regularization for Reduced Variance Reinforcement Learning | Code | 0 |
| Deep Multi-Agent Reinforcement Learning Based Cooperative Edge Caching in Wireless Networks | | 0 |
| Distributional Reinforcement Learning for Efficient Exploration | | 0 |
| Learning and Exploiting Multiple Subgoals for Fast Exploration in Hierarchical Reinforcement Learning | | 0 |
| CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario | Code | 0 |
| Task-Agnostic Dynamics Priors for Deep Reinforcement Learning | Code | 0 |
| Multi-Agent Image Classification via Reinforcement Learning | Code | 0 |
| Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations | | 0 |
| Learning Phase Competition for Traffic Signal Control | Code | 0 |
| Diagnosing Reinforcement Learning for Traffic Signal Control | | 0 |
| Graph Attention Memory for Visual Navigation | | 0 |
| Optimizing Routerless Network-on-Chip Designs: An Innovative Learning-Based Framework | | 0 |
| Intelligent User Association for Symbiotic Radio Networks using Deep Reinforcement Learning | | 0 |
| Domain Adversarial Reinforcement Learning for Partial Domain Adaptation | | 0 |
| Attention-based Deep Reinforcement Learning for Multi-view Environments | | 0 |
| Do Autonomous Agents Benefit from Hearing? | | 0 |
| Autonomous Management of Energy-Harvesting IoT Nodes Using Deep Reinforcement Learning | Code | 0 |
| Emergent Escape-based Flocking Behavior using Multi-Agent Reinforcement Learning | | 0 |
| Design of Artificial Intelligence Agents for Games using Deep Reinforcement Learning | | 0 |
| GAN-powered Deep Distributional Reinforcement Learning for Resource Management in Network Slicing | | 0 |
| On the Detection of Mutual Influences and Their Consideration in Reinforcement Learning Processes | | 0 |
| Multi-Pass Q-Networks for Deep Reinforcement Learning with Parameterised Action Spaces | Code | 0 |
| Reinforcement Learning in Non-Stationary Environments | | 0 |
| Toward Packet Routing with Fully-distributed Multi-agent Deep Reinforcement Learning | | 0 |
| A Reinforcement Learning Perspective on the Optimal Control of Mutation Probabilities for the (1+1) Evolutionary Algorithm: First Results on the OneMax Problem | | 0 |
| Path Design for Cellular-Connected UAV with Reinforcement Learning | | 0 |
| Pretrain Soft Q-Learning with Imperfect Demonstrations | | 0 |
| Learning to Evolve | Code | 0 |
| Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning | | 0 |
| Accelerated Target Updates for Q-learning | | 0 |
| Toybox: A Suite of Environments for Experimental Evaluation of Deep Reinforcement Learning | Code | 0 |
| Object Exchangeability in Reinforcement Learning: Extended Abstract | | 0 |
| Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs | | 0 |
| Continual and Multi-task Reinforcement Learning With Shared Episodic Memory | | 0 |
| A Complementary Learning Systems Approach to Temporal Difference Learning | | 0 |
| Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning | Code | 0 |
| Combining Planning and Deep Reinforcement Learning in Tactical Decision Making for Autonomous Driving | | 0 |
| Deep Ordinal Reinforcement Learning | Code | 0 |
| Learning to Control in Metric Space with Optimal Regret | Code | 0 |
| P3O: Policy-on Policy-off Policy Optimization | Code | 0 |
Page 252 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |