SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives for its actions, in the form of rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
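The agent-environment loop described above can be sketched with tabular Q-learning, one common RL algorithm among many. This is a minimal illustration rather than a reference implementation: the five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all assumptions made up for this example.

```python
import random

# Hypothetical toy environment: a chain of states 0..4, where state 4 is terminal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only when the goal state is reached
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table by interacting with the chain environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore with probability epsilon (or on ties),
            # otherwise take the action with the higher current estimate.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

With these assumed settings, the learned greedy policy should choose "right" in every non-terminal state, since that is the shortest path to the only reward. The discount factor `gamma` is what makes the agent prefer reaching the reward sooner rather than later.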

Papers

Showing 14451–14500 of 15113 papers

Title | Status | Hype
Deep Episodic Value Iteration for Model-based Meta-Reinforcement Learning | | 0
Reinforced Mnemonic Reader for Machine Reading Comprehension | Code | 0
Experimental results : Reinforcement Learning of POMDPs using Spectral Methods | | 0
Machine Comprehension by Text-to-Text Neural Question Generation | Code | 0
Answer Set Programming for Non-Stationary Markov Decision Processes | | 0
Navigating Occluded Intersections with Autonomous Vehicles using Deep Reinforcement Learning | | 0
Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning | Code | 0
Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning | | 0
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning | Code | 0
On Improving Deep Reinforcement Learning for POMDPs | Code | 0
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood | Code | 0
Molecular De Novo Design through Deep Reinforcement Learning | Code | 0
Reinforcement Learning-based Thermal Comfort Control for Vehicle Cabins | | 0
Stochastic Constraint Programming as Reinforcement Learning | | 0
Time-Contrastive Networks: Self-Supervised Learning from Video | Code | 1
Equivalence Between Policy Gradients and Soft Q-Learning | | 0
Modular Multi-Objective Deep Reinforcement Learning with Decision Values | Code | 0
Reinforcement Learning with External Knowledge and Two-Stage Q-functions for Predicting Popular Reddit Threads | | 0
A Reinforcement Learning Approach to Weaning of Mechanical Ventilation in Intensive Care Units | | 0
Investigating Recurrence and Eligibility Traces in Deep Q-Networks | | 0
Beating Atari with Natural Language Guided Reinforcement Learning | Code | 0
Pseudorehearsal in actor-critic agents | | 0
Effective Warm Start for the Online Actor-Critic Reinforcement Learning based mHealth Intervention | | 0
Task-Oriented Query Reformulation with Reinforcement Learning | Code | 0
MUSE: Modularizing Unsupervised Sense Embeddings | Code | 0
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning | | 0
Ultrafast photonic reinforcement learning based on laser chaos | | 0
Environment-Independent Task Specifications via GLTL | | 0
Optimizing Differentiable Relaxations of Coreference Evaluation Metrics | Code | 0
Virtual to Real Reinforcement Learning for Autonomous Driving | Code | 0
Deep Reinforcement Learning-based Image Captioning with Embedding Reward | | 0
Deep Q-learning from Demonstrations | Code | 0
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation | | 0
Stochastic Neural Networks for Hierarchical Reinforcement Learning | Code | 0
Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning | | 0
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning | | 0
Deep Reinforcement Learning framework for Autonomous Driving | Code | 0
Stein Variational Policy Gradient | | 0
Finite Sample Analyses for TD(0) with Function Approximation | | 0
On the Properties of the Softmax Function with Application in Game Theory and Reinforcement Learning | | 0
Multi-Advisor Reinforcement Learning | | 0
Evaluating Persuasion Strategies and Deep Reinforcement Learning methods for Negotiation Dialogue agents | | 0
Integrated Learning of Dialog Strategies and Semantic Parsing | | 0
Learning Visual Servoing with Deep Features and Fitted Q-Iteration | Code | 0
Sentence Simplification with Deep Reinforcement Learning | Code | 0
Enter the Matrix: Safely Interruptible Autonomous Systems via Virtualization | | 0
Dynamic Computational Time for Visual Attention | Code | 0
Inverse Risk-Sensitive Reinforcement Learning | | 0
Inverse Reinforcement Learning from Summary Data | | 0
Socially Aware Motion Planning with Deep Reinforcement Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified