SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, given as rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
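The interaction loop described above can be sketched with tabular Q-learning, one common RL algorithm, on a toy problem. Everything in this sketch is an assumption made for illustration and does not come from any paper listed on this page: the five-state chain environment, the helper names (`step`, `choose_action`, `train`, `greedy_policy`), and the hyperparameter values.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice with random tie-breaking among the best actions."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run Q-learning and return the learned table q[state][action]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap episode length (assumed limit)
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus the discounted
            # best estimated value of the next state (none if terminal).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the only reward sits at the right end of the chain, so the learned greedy policy should settle on moving right in every state. The discount factor `gamma` is what makes the agent favor long-term reward over the immediate (zero) reward of each intermediate step.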

Papers

Showing 14751-14800 of 15113 papers

Title | Status | Hype
Reinforcement Learning algorithms for regret minimization in structured Markov Decision Processes | | 0
Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies | | 0
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | | 0
Perceptual Reward Functions | | 0
On Lower Bounds for Regret in Reinforcement Learning | | 0
Posterior Sampling for Reinforcement Learning Without Episodes | Code | 0
Neuroevolution-Based Inverse Reinforcement Learning | | 0
Online Adaptation of Deep Architectures with Reinforcement Learning | | 0
Discovering Latent States for Model Learning: Applying Sensorimotor Contingencies Theory and Predictive Processing to Model Context | | 0
Self-organization in a distributed coordination game through heuristic rules | | 0
A Sensorimotor Reinforcement Learning Framework for Physical Human-Robot Interaction | | 0
Accelerating Stochastic Composition Optimization | | 0
An Actor-Critic Algorithm for Sequence Prediction | Code | 0
Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay | Code | 0
Sequential Cost-Sensitive Feature Acquisition | | 0
Automatic Bridge Bidding Using Deep Reinforcement Learning | | 0
A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning | Code | 0
Why is Posterior Sampling Better than Optimism for Reinforcement Learning? | | 0
Is the Bellman residual a bad proxy? | | 0
Unsupervised preprocessing for Tactile Data | | 0
Simultaneous Control and Human Feedback in the Training of a Robotic Agent with Actor-Critic Reinforcement Learning | | 0
A Hierarchical Reinforcement Learning Method for Persistent Time-Sensitive Tasks | | 0
On Reward Function for Survival | | 0
Successor Features for Transfer in Reinforcement Learning | | 0
Deep Reinforcement Learning Discovers Internal Models | | 0
Deep Reinforcement Learning With Macro-Actions | | 0
Natural Language Generation as Planning under Uncertainty Using Reinforcement Learning | | 0
Model-Free Episodic Control | Code | 0
Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads | Code | 0
Policy Networks with Two-Stage Training for Dialogue Systems | | 0
Cooperative Inverse Reinforcement Learning | Code | 0
Face valuing: Training user interfaces with facial expressions and reinforcement learning | | 0
Continuously Learning Neural Dialogue Management | | 0
Deep Successor Reinforcement Learning | Code | 0
Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning | Code | 0
Safe and Efficient Off-Policy Reinforcement Learning | Code | 0
Adapting Sampling Interval of Sensor Networks Using On-Line Reinforcement Learning | | 0
Learning to Optimize | | 0
Unifying Count-Based Exploration and Intrinsic Motivation | Code | 0
Deep Reinforcement Learning for Dialogue Generation | Code | 0
Deep Q-Networks for Accelerating the Training of Deep Neural Networks | | 0
End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning | | 0
Difference of Convex Functions Programming Applied to Control with Expert Data | | 0
Reinforcement Learning for Semantic Segmentation in Indoor Scenes | | 0
Death and Suicide in Universal Artificial Intelligence | | 0
Reinforcement Learning for Visual Object Detection | | 0
VIME: Variational Information Maximizing Exploration | Code | 0
Information Theoretically Aided Reinforcement Learning for Embodied Agents | | 0
Control of Memory, Active Perception, and Action in Minecraft | | 0
Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent | Code | 0
Page 296 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified