SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes the expected long-term reward.
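As an illustration of the interaction loop described above, here is a minimal sketch of tabular Q-learning, one common RL algorithm. Everything in it is an assumption made for the example rather than something taken from this page: the five-state chain environment, the `step`, `train` and `greedy_policy` helpers, the discount factor and the learning rate. The agent acts uniformly at random while learning, which Q-learning tolerates because it is an off-policy method.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, max_steps=1000, seed=0):
    """Run Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned greedy policy should move right in every state, since that is the shortest path to the only reward. The discount factor is what turns "long-term reward" into a concrete quantity: a reward that is further away is worth less, so shorter paths score higher.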

Papers

Showing 14651-14700 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Memory Lens: How Much Memory Does an Agent Use? | | 0 |
| Options Discovery with Budgeted Reinforcement Learning | | 0 |
| A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games | | 0 |
| Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU | Code | 0 |
| Learning to reinforcement learn | Code | 0 |
| Reinforcement Learning with Unsupervised Auxiliary Tasks | Code | 0 |
| #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning | Code | 1 |
| A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models | Code | 0 |
| Hierarchical Object Detection with Deep Reinforcement Learning | Code | 0 |
| Learning to Navigate in Complex Environments | Code | 0 |
| Reinforcement Learning in Rich-Observation MDPs using Spectral Methods | | 0 |
| RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning | Code | 0 |
| Fairness in Reinforcement Learning | | 0 |
| Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control | | 0 |
| Reinforcement Learning Approach for Parallelization in Filters Aggregation Based Feature Selection Algorithms | | 0 |
| Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic | Code | 0 |
| Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning | | 0 |
| Designing Neural Network Architectures using Reinforcement Learning | Code | 0 |
| Modular Multitask Reinforcement Learning with Policy Sketches | Code | 0 |
| Learning to Perform Physics Experiments via Deep Reinforcement Learning | | 0 |
| Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening | Code | 0 |
| Neural Architecture Search with Reinforcement Learning | Code | 0 |
| Multi-task learning with deep model based reinforcement learning | | 0 |
| Using a Deep Reinforcement Learning Agent for Traffic Signal Control | | 0 |
| Quantile Reinforcement Learning | | 0 |
| Learning Locomotion Skills Using DeepRL: Does the Choice of Action Space Matter? | | 0 |
| Sample Efficient Actor-Critic with Experience Replay | Code | 1 |
| Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear | | 0 |
| Learning Runtime Parameters in Computer Systems with Delayed Experience Injection | | 0 |
| Contextual Decision Processes with Low Bellman Rank are PAC-Learnable | | 0 |
| Quantum-enhanced machine learning | | 0 |
| Reinforcement Learning in Conflicting Environments for Autonomous Vehicles | | 0 |
| Utilization of Deep Reinforcement Learning for saccadic-based object visual search | | 0 |
| A Reinforcement Learning Approach to the View Planning Problem | | 0 |
| Particle Swarm Optimization for Generating Interpretable Fuzzy Reinforcement Learning Policies | | 0 |
| Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data | | 0 |
| The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits | | 0 |
| Reset-free Trial-and-Error Learning for Robot Damage Recovery | Code | 0 |
| Sim-to-Real Robot Learning from Pixels with Progressive Nets | | 0 |
| Introduction to the "Industrial Benchmark" | | 0 |
| Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving | | 0 |
| Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation | | 0 |
| Personalizing a Dialogue System with Transfer Reinforcement Learning | | 0 |
| Multi-Objective Deep Reinforcement Learning | Code | 0 |
| Deep Reinforcement Learning From Raw Pixels in Doom | | 0 |
| Active exploration in parameterized reinforcement learning | Code | 0 |
| Towards Cognitive Exploration through Deep Reinforcement Learning for Mobile Robots | | 0 |
| Connecting Generative Adversarial Networks and Actor-Critic Methods | | 0 |
| Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States | | 0 |
| Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |