SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
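
The interaction loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five-state corridor, the `step` and `rollout` functions, the two fixed policies, and the discount factor `gamma` are all hypothetical and are not taken from any paper listed on this page. The sketch shows how a policy's cumulative (here, discounted) reward is measured; it does not implement a learning algorithm.

```python
from typing import Callable, Tuple

GOAL = 4  # hypothetical corridor with states 0..4; reaching state 4 ends the episode


def step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy environment: action is -1 (move left) or +1 (move right)."""
    next_state = min(max(state + action, 0), GOAL)  # walls at both ends
    done = next_state == GOAL
    reward = 1.0 if done else 0.0  # assumed reward: 1 only on reaching the goal
    return next_state, reward, done


def rollout(policy: Callable[[int], int], gamma: float = 0.9, max_steps: int = 20) -> float:
    """Run one episode from state 0 and return the discounted sum of rewards."""
    state, total, discount = 0, 0.0, 1.0
    for _ in range(max_steps):
        state, reward, done = step(state, policy(state))
        total += discount * reward
        discount *= gamma
        if done:
            break
    return total


def always_right(state: int) -> int:
    return +1


def always_left(state: int) -> int:
    return -1


if __name__ == "__main__":
    print("always_right:", rollout(always_right))
    print("always_left:", rollout(always_left))
```

In this toy setting the policy that always moves right reaches the goal and collects a positive return, while the policy that always moves left never does. An RL algorithm would search for the better policy from experience rather than being handed it.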

Papers

Showing 14551-14600 of 15113 papers

Title | Status | Hype
Bridging the Gap Between Value and Policy Based Reinforcement Learning |  | 0
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning | Code | 1
Neural Map: Structured Memory for Deep Reinforcement Learning | Code | 0
A Dataset for Developing and Benchmarking Active Vision |  | 0
Reinforcement Learning with Deep Energy-Based Policies | Code | 0
Learning Control for Air Hockey Striking using Deep Reinforcement Learning |  | 0
Stochastic Variance Reduction Methods for Policy Evaluation |  | 0
Robot gains Social Intelligence through Multimodal Deep Reinforcement Learning |  | 0
Online Meta-learning by Parallel Algorithm Competition |  | 0
Changing Model Behavior at Test-Time Using Reinforcement Learning |  | 0
Control of Gene Regulatory Networks with Noisy Measurements and Uncertain Inputs |  | 0
Automatic Representation for Lifetime Value Recommender Systems |  | 0
Data Distillation for Controlling Specificity in Dialogue Generation |  | 0
Tackling Error Propagation through Reinforcement Learning: A Case of Greedy Dependency Parsing | Code | 0
Real-time visual tracking by deep reinforced decision making | Code | 0
Reinforcement Learning Based Argument Component Detection |  | 0
Towards a Common Implementation of Reinforcement Learning for Multiple Robotic Tasks | Code | 0
Active One-shot Learning | Code | 0
Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning | Code | 0
Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning |  | 0
Collaborative Deep Reinforcement Learning | Code | 0
Collaborative Deep Reinforcement Learning for Joint Object Search |  | 0
Batch Policy Gradient Methods for Improving Neural Conversation Models |  | 0
Multi-agent Reinforcement Learning in Sequential Social Dilemmas | Code | 1
Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning | Code | 0
Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning |  | 0
Autonomous Braking System via Deep Reinforcement Learning | Code | 0
Semi-Supervised QA with Generative Domain-Adaptive Nets |  | 0
Uncertainty-Aware Reinforcement Learning for Collision Avoidance |  | 0
Deep Reinforcement Learning for Robotic Manipulation-The state of the art |  | 0
Deep Reinforcement Learning for Visual Object Tracking in Videos |  | 0
Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning |  | 0
PathNet: Evolution Channels Gradient Descent in Super Neural Networks | Code | 0
Flow Navigation by Smart Microswimmers via Reinforcement Learning |  | 0
Reinforcement Learning Algorithm Selection |  | 0
Learning Light Transport the Reinforced Way | Code | 0
Deep Reinforcement Learning: An Overview | Code | 0
Artificial Intelligence Approaches To UCAV Autonomy |  | 0
Regularizing Neural Networks by Penalizing Confident Output Distributions | Code | 0
Adversarial Learning for Neural Dialogue Generation | Code | 0
Binary Matrix Guessing Problem |  | 0
Basic protocols in quantum reinforcement learning with superconducting circuits |  | 0
Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks | Code | 0
Near Optimal Behavior via Approximate State Abstraction | Code | 0
Agent-Agnostic Human-in-the-Loop Reinforcement Learning |  | 0
Scalable and Incremental Learning of Gaussian Mixture Models |  | 0
Real-Time Bidding by Reinforcement Learning in Display Advertising | Code | 0
Reinforcement Learning via Recurrent Convolutional Neural Networks | Code | 0
Reinforcement Learning based Embodied Agents Modelling Human Users Through Interaction and Multi-Sensory Perception |  | 0
A Review of Neural Network Based Machine Learning Approaches for Rotor Angle Stability Control |  | 0
Page 292 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified