SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
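
The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one common RL method chosen here only for illustration, on a hypothetical five-state chain environment. The environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are assumptions made for this example and do not come from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward 1 only when the terminal state is reached, otherwise 0.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties between equal values at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` should yield the action "move right" in every non-terminal state, which is the reward-maximizing strategy for this toy chain.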

Papers

Showing 14451–14500 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| End-to-end Active Object Tracking via Reinforcement Learning | | 0 |
| Experience Replay Using Transition Sequences | | 0 |
| Constrained Policy Optimization | Code | 0 |
| Fine-grained acceleration control for autonomous intersection management using deep reinforcement learning | | 0 |
| Latent Intention Dialogue Models | Code | 0 |
| Free energy-based reinforcement learning using a quantum processor | Code | 0 |
| Boltzmann Exploration Done Right | | 0 |
| Role Playing Learning for Socially Concomitant Mobile Robot Navigation | | 0 |
| First-spike based visual categorization using reward-modulated STDP | | 0 |
| Cross-Domain Perceptual Reward Functions | | 0 |
| State Space Decomposition and Subgoal Creation for Transfer in Deep Reinforcement Learning | | 0 |
| Reinforcement Learning with a Corrupted Reward Channel | Code | 0 |
| Visual Semantic Planning using Deep Successor Representations | | 0 |
| Safe Model-based Reinforcement Learning with Stability Guarantees | Code | 0 |
| Continuous State-Space Models for Optimal Sepsis Treatment - a Deep Reinforcement Learning Approach | | 0 |
| Enhanced Experience Replay Generation for Efficient Reinforcement Learning | | 0 |
| Guide Actor-Critic for Continuous Control | Code | 0 |
| AIXIjs: A Software Demo for General Reinforcement Learning | Code | 0 |
| A unified view of entropy-regularized Markov decision processes | | 0 |
| Ask the Right Questions: Active Question Reformulation with Reinforcement Learning | Code | 0 |
| Concrete Dropout | Code | 0 |
| Learning to Mix n-Step Returns: Generalizing lambda-Returns for Deep Reinforcement Learning | | 0 |
| Experience enrichment based task independent reward model | | 0 |
| Shallow Updates for Deep Reinforcement Learning | | 0 |
| Learning to Factor Policies and Action-Value Functions: Factored Action Space Representations for Deep Reinforcement learning | | 0 |
| Batch Reinforcement Learning on the Industrial Benchmark: First Experiences | | 0 |
| A Comparison of Reinforcement Learning Techniques for Fuzzy Cloud Auto-Scaling | | 0 |
| Atari games and Intel processors | | 0 |
| Posterior sampling for reinforcement learning: worst-case regret bounds | | 0 |
| Delving into adversarial attacks on deep policies | | 0 |
| Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning | Code | 0 |
| Automatic Goal Generation for Reinforcement Learning Agents | Code | 0 |
| New Reinforcement Learning Using a Chaotic Neural Network for Emergence of "Thinking" - "Exploration" Grows into "Thinking" through Learning - | | 0 |
| Repeated Inverse Reinforcement Learning | | 0 |
| Emotion in Reinforcement Learning Agents and Robots: A Survey | | 0 |
| Efficient Parallel Methods for Deep Reinforcement Learning | Code | 0 |
| Deep Episodic Value Iteration for Model-based Meta-Reinforcement Learning | | 0 |
| Policy Iterations for Reinforcement Learning Problems in Continuous Time and Space -- Fundamental Theory and Methods | Code | 0 |
| Reinforced Mnemonic Reader for Machine Reading Comprehension | Code | 0 |
| Experimental results : Reinforcement Learning of POMDPs using Spectral Methods | | 0 |
| Machine Comprehension by Text-to-Text Neural Question Generation | Code | 0 |
| Answer Set Programming for Non-Stationary Markov Decision Processes | | 0 |
| Navigating Occluded Intersections with Autonomous Vehicles using Deep Reinforcement Learning | | 0 |
| Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning | Code | 0 |
| Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning | | 0 |
| Mapping Instructions and Visual Observations to Actions with Reinforcement Learning | Code | 0 |
| On Improving Deep Reinforcement Learning for POMDPs | Code | 0 |
| Reinforcement Learning-based Thermal Comfort Control for Vehicle Cabins | | 0 |
| Molecular De Novo Design through Deep Reinforcement Learning | Code | 0 |
| From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |