SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
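As a rough illustration of the agent, environment, reward, and policy described above, here is a minimal sketch of tabular Q-learning on a toy problem. Everything in it is an assumption made for illustration and does not come from this page: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical, and Q-learning is only one of many RL algorithms.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (assumed): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward is 1.0 only when the agent reaches the rightmost state.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), else exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state under the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

With these settings the greedy policy should pick action 1 (move right) in every non-terminal state, since that is the shortest route to the only reward. The discount factor `gamma` is what makes the agent prefer reaching the reward sooner rather than later.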

Papers

Showing 13251-13300 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Learning First-to-Spike Policies for Neuromorphic Control Using Policy Gradients | | 0 |
| Hierarchical Approaches for Reinforcement Learning in Parameterized Action Space | | 0 |
| Graph Convolutional Reinforcement Learning | Code | 0 |
| Multi-Agent Actor-Critic with Generative Cooperative Policy Network | | 0 |
| Risk-Sensitive Reinforcement Learning via Policy Gradient Search | | 0 |
| RLgraph: Modular Computation Graphs for Deep Reinforcement Learning | Code | 0 |
| Actor-Critic Policy Optimization in Partially Observable Multiagent Environments | | 0 |
| Autonomous Self-Explanation of Behavior for Interactive Reinforcement Learning Agents | | 0 |
| Safe Reinforcement Learning with Model Uncertainty Estimates | | 0 |
| Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning | Code | 1 |
| Optimization of Molecules via Deep Reinforcement Learning | Code | 1 |
| Finding the best design parameters for optical nanostructures using reinforcement learning | | 0 |
| Applications of Deep Reinforcement Learning in Communications and Networking: A Survey | | 0 |
| Fast deep reinforcement learning using online adjustments from the past | Code | 0 |
| At Human Speed: Deep Reinforcement Learning with Action Delay | | 0 |
| Reinforcement Learning Decoders for Fault-Tolerant Quantum Computation | Code | 0 |
| Learning Socially Appropriate Robot Approaching Behavior Toward Groups using Deep Reinforcement Learning | Code | 0 |
| The Concept of Criticality in Reinforcement Learning | | 0 |
| Multi-Stage Reinforcement Learning For Object Detection | Code | 0 |
| Using Deep Reinforcement Learning for the Continuous Control of Robotic Arms | | 0 |
| Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning | Code | 0 |
| Deep Transfer Reinforcement Learning for Text Summarization | Code | 0 |
| CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning | Code | 0 |
| Deep Reinforcement Learning | Code | 3 |
| Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost | | 0 |
| Assessing the Potential of Classical Q-learning in General Game Playing | Code | 0 |
| Two Can Play That Game: An Adversarial Evaluation of a Cyber-alert Inspection System | | 0 |
| Optimal Hierarchical Learning Path Design with Reinforcement Learning | | 0 |
| A Survey and Critique of Multiagent Deep Reinforcement Learning | | 0 |
| Bayesian Inference of Self-intention Attributed by Observer | | 0 |
| Empowerment-driven Exploration using Mutual Information Estimation | Code | 0 |
| Adversarial Text Generation Without Reinforcement Learning | | 0 |
| Closed-form approximations in multi-asset market making | | 0 |
| Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space | Code | 1 |
| The Laplacian in RL: Learning Representations with Efficient Approximations | | 0 |
| Investigating Enactive Learning for Autonomous Intelligent Agents | | 0 |
| Distributed Wildfire Surveillance with Autonomous Aircraft using Deep Reinforcement Learning | | 0 |
| Enabling Cognitive Smart Cities Using Big Data and Machine Learning: Approaches and Challenges | | 0 |
| Continual State Representation Learning for Reinforcement Learning using Generative Replay | | 0 |
| A Distributed Reinforcement Learning Solution With Knowledge Transfer Capability for A Bike Rebalancing Problem | | 0 |
| Discovering General-Purpose Active Learning Strategies | Code | 0 |
| Semi-supervised Deep Reinforcement Learning in Support of IoT and Smart City Services | Code | 0 |
| Reinforcement Learning for Improving Agent Design | Code | 0 |
| SFV: Reinforcement Learning of Physical Skills from Videos | Code | 0 |
| Multi-agent Deep Reinforcement Learning for Zero Energy Communities | | 0 |
| Fast Context Adaptation via Meta-Learning | Code | 1 |
| Actor-Critic Deep Reinforcement Learning for Dynamic Multichannel Access | | 0 |
| Reinforcement Evolutionary Learning Method for self-learning | | 0 |
| Scaling All-Goals Updates in Reinforcement Learning Using Convolutional Neural Networks | Code | 0 |
| PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation | Code | 0 |
Page 266 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |