SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
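The interaction loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-state chain environment (the `step` function, the reward of 1.0 at the goal, and the constants `GAMMA` and `ALPHA` are all invented for illustration) and uses tabular Q-learning as one example of a learning rule; it is not taken from any paper listed on this page.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Hypothetical toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, max_steps=100, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned greedy policy should move right in every non-terminal state, since that is the shortest path to the only rewarded state. Because Q-learning is off-policy, the agent can behave randomly while still estimating the values of the greedy policy.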

Papers

Showing 12251-12300 of 15113 papers

Title | Status | Hype
Voting-Based Multi-Agent Reinforcement Learning for Intelligent IoT | | 0
Modified Actor-Critics | | 0
Generalizing from a few environments in safety-critical reinforcement learning | | 0
A Reinforcement Learning Approach for the Multichannel Rendezvous Problem | | 0
Dynamic Face Video Segmentation via Reinforcement Learning | | 0
Conservative Q-Improvement: Reinforcement Learning for an Interpretable Decision-Tree Policy | Code | 0
Learning How to Active Learn by Dreaming | Code | 0
Look Harder: A Neural Machine Translation Model with Hard Attention | | 0
Using Semantic Similarity as Reward for Reinforcement Learning in Sentence Generation | | 0
Reinforced Training Data Selection for Domain Adaptation | | 0
Historical Text Normalization with Delayed Rewards | | 0
End-to-end Deep Reinforcement Learning Based Coreference Resolution | | 0
Designing Deep Reinforcement Learning for Human Parameter Exploration | | 0
Learning World Graphs to Accelerate Hierarchical Reinforcement Learning | | 0
FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control | | 0
Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model | Code | 0
On mechanisms for transfer using landmark value functions in multi-task lifelong reinforcement learning | | 0
Variational Quantum Circuits for Deep Reinforcement Learning | Code | 0
Multiple Landmark Detection using Multi-Agent Reinforcement Learning | Code | 0
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog | | 0
Collaboration of AI Agents via Cooperative Multi-Agent Deep Reinforcement Learning | | 0
Detecting Spiky Corruption in Markov Decision Processes | Code | 0
On Training Flexible Robots using Deep Reinforcement Learning | | 0
Growing Action Spaces | Code | 0
Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning | Code | 0
Hyp-RL : Hyperparameter Optimization by Reinforcement Learning | Code | 0
From self-tuning regulators to reinforcement learning and back again | | 0
Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes | | 0
Demonstration-Guided Deep Reinforcement Learning of Control Policies for Dexterous Human-Robot Interaction | | 0
QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game | Code | 0
Toward Simulating Environments in Reinforcement Learning Based Recommendations | | 0
Compositional Transfer in Hierarchical Reinforcement Learning | | 0
PyRep: Bringing V-REP to Deep Robot Learning | Code | 0
Towards Empathic Deep Q-Learning | Code | 0
Approximate Dynamic Programming For Linear Systems with State and Input Constraints | | 0
Cooperation-Aware Reinforcement Learning for Merging in Dense Traffic | Code | 0
A Tractable Algorithm For Finite-Horizon Continuous Reinforcement Learning | | 0
Efficient Navigation of Colloidal Robots in an Unknown Environment via Deep Reinforcement Learning | | 0
Probabilistic model predictive safety certification for learning-based control | | 0
Policy Optimization with Stochastic Mirror Descent | | 0
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives | | 0
On Multi-Agent Learning in Team Sports Games | | 0
Optimistic Proximal Policy Optimization | | 0
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy | | 0
Multi-Agent Deep Reinforcement Learning for Liquidation Strategy Analysis | Code | 1
Event-Driven Models | | 0
A Theoretical Connection Between Statistical Physics and Reinforcement Learning | | 0
Deceptive Reinforcement Learning Under Adversarial Manipulations on Cost Signals | | 0
Deep Conservative Policy Iteration | | 0
Inverse reinforcement learning conditioned on brain scan | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified