SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
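
As a rough illustration of the agent-environment loop described above, the following sketch runs tabular Q-learning on a toy problem. The corridor environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all assumptions made for this example and are not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a corridor of N_STATES cells.
# The agent starts in cell 0; reaching the last cell gives reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon; also pick randomly when the two values tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move Q(s, a) toward the target r + gamma * max_a' Q(s', a').
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy move (-1 or +1) for each non-terminal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))  # prints [1, 1, 1, 1]
```

The learned policy moves right in every cell, which is the reward-maximizing strategy in this corridor. The discount factor `gamma` is what makes the agent value long-term reward: cells farther from the goal end up with smaller, but still positive, action values.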

Papers

Showing 14851-14900 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning to Compose Neural Networks for Question Answering | Code | 0 |
| Angrier Birds: Bayesian reinforcement learning | Code | 0 |
| Taming the Noise in Reinforcement Learning via Soft Updates | Code | 0 |
| Inverse Reinforcement Learning via Deep Gaussian Process | | 0 |
| Deep Reinforcement Learning in Large Discrete Action Spaces | Code | 0 |
| An Empirical Comparison of Neural Architectures for Reinforcement Learning in Partially Observable Environments | | 0 |
| Increasing the Action Gap: New Operators for Reinforcement Learning | Code | 0 |
| How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies | | 0 |
| Deep Attention Recurrent Q-Network | Code | 0 |
| Risk-Constrained Reinforcement Learning with Percentile Risk Criteria | | 0 |
| Q-Networks for Binary Vector Actions | | 0 |
| State of the Art Control of Atari Games Using Shallow Reinforcement Learning | Code | 0 |
| Multi-Class Multi-Annotator Active Learning With Robust Gaussian Process for Visual Recognition | | 0 |
| Inverse Reinforcement Learning with Locally Consistent Reward Functions | | 0 |
| On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models | | 0 |
| Reinforcement Learning Applied to an Electric Water Heater: From Theory to Practice | | 0 |
| Robotic Search & Rescue via Online Multi-task Reinforcement Learning | | 0 |
| On the convergence of cycle detection for navigational reinforcement learning | | 0 |
| Strategic Dialogue Management via Deep Reinforcement Learning | Code | 0 |
| MazeBase: A Sandbox for Learning from Games | Code | 0 |
| Dueling Network Architectures for Deep Reinforcement Learning | Code | 0 |
| Conditional Computation in Neural Networks for faster models | Code | 0 |
| Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning | Code | 0 |
| Policy Distillation | Code | 0 |
| Active Object Localization with Deep Reinforcement Learning | Code | 0 |
| Deep Reinforcement Learning with a Natural Language Action Space | Code | 0 |
| Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control | | 0 |
| Doubly Robust Off-policy Value Evaluation for Reinforcement Learning | | 0 |
| Learning Unfair Trading: a Market Manipulation Analysis From the Reinforcement Learning Perspective | | 0 |
| Generating Text with Deep Reinforcement Learning | | 0 |
| Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning | | 0 |
| On the Computability of AIXI | | 0 |
| Dual Control for Approximate Bayesian Reinforcement Learning | | 0 |
| Data-Efficient Learning of Feedback Policies from Image Pixels using Deep Dynamical Models | | 0 |
| Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning | Code | 0 |
| One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors | | 0 |
| Model-based Reinforcement Learning with Parametrized Physical Models and Optimism-Driven Exploration | | 0 |
| Learning Deep Control Policies for Autonomous Aerial Vehicles with MPC-Guided Policy Search | | 0 |
| Deep Spatial Autoencoders for Visuomotor Learning | Code | 0 |
| Benchmarking for Bayesian Reinforcement Learning | | 0 |
| Optimization of anemia treatment in hemodialysis patients via reinforcement learning | | 0 |
| Recurrent Reinforcement Learning: A Hybrid Approach | | 0 |
| Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies | | 0 |
| Reinforcement Learning with Parameterized Actions | Code | 0 |
| Reinforcement Learning in Multi-Party Trading Dialog | | 0 |
| Reinforcement Learning of Multi-Issue Negotiation Dialogue Policies | | 0 |
| Optimising Turn-Taking Strategies With Reinforcement Learning | | 0 |
| Hyper-parameter Optimisation of Gaussian Process Reinforcement Learning for Statistical Dialogue Management | | 0 |
| A Cognitive Architecture Based on a Learning Classifier System with Spiking Classifiers | | 0 |
| Learning Efficient Representations for Reinforcement Learning | | 0 |
Page 298 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |