SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, given as rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
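As a rough illustration of the interaction loop described above, here is a minimal sketch of tabular Q-learning. It does not come from any paper listed on this page. The corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical choices made only for this example.

```python
import random

# Hypothetical toy environment (an assumption for illustration only):
# a corridor of 5 states. The agent starts in state 0, and reaching
# state 4 yields reward +1 and ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right
GAMMA = 0.9         # discount factor applied to future rewards


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best
            # estimated value of the next state (zero once the episode ends).
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action for each non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this sketch the reward only arrives at the end of the corridor, so the agent has to propagate it backward through the discounted targets. This is one simple way to see what "maximizing long-term reward" means in practice.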

Papers

Showing 14601–14650 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Toward negotiable reinforcement learning: shifting priorities in Pareto optimal sequential decision-making | | 0 |
| Non-Deterministic Policy Improvement Stabilizes Approximated Reinforcement Learning | | 0 |
| On the function approximation error for risk-sensitive reinforcement learning | | 0 |
| First-Person Activity Forecasting with Online Inverse Reinforcement Learning | | 0 |
| Loss is its own Reward: Self-Supervision for Reinforcement Learning | | 0 |
| A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation | Code | 0 |
| Unsupervised Perceptual Rewards for Imitation Learning | | 0 |
| Self-Correcting Models for Model-Based Reinforcement Learning | Code | 0 |
| Sample-efficient Deep Reinforcement Learning for Dialog Control | | 0 |
| Reinforcement Learning Using Quantum Boltzmann Machines | | 0 |
| Learning to predict where to look in interactive environments using deep recurrent q-learning | | 0 |
| A User Simulator for Task-Completion Dialogues | Code | 0 |
| An Alternative Softmax Operator for Reinforcement Learning | Code | 1 |
| Deep Reinforcement Learning with Successor Features for Navigation across Similar Environments | | 0 |
| Learning through Dialogue Interactions by Asking Questions | Code | 2 |
| Separation of Concerns in Reinforcement Learning | | 0 |
| Response to Comment on 'Perceptual Learning Incepted by Decoded fMRI Neurofeedback Without Stimulus Presentation'; How can a decoded neurofeedback method (DecNef) lead to successful reinforcement and visual perceptual learning? | | 0 |
| End-to-End Deep Reinforcement Learning for Lane Keeping Assist | | 0 |
| Incorporating Human Domain Knowledge into Large Scale Cost Function Learning | | 0 |
| Online Reinforcement Learning for Real-Time Exploration in Continuous State and Action Markov Decision Processes | | 0 |
| PoseAgent: Budget-Constrained 6D Object Pose Estimation via Reinforcement Learning | | 0 |
| Learning to Drive using Inverse Reinforcement Learning and Deep Q-Networks | | 0 |
| Reinforcement Learning With Temporal Logic Rewards | | 0 |
| Towards deep learning with spiking neurons in energy based models with contrastive Hebbian plasticity | | 0 |
| Towards Information-Seeking Agents | | 0 |
| Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning | | 0 |
| Hierarchy through Composition with Linearly Solvable Markov Decision Processes | | 0 |
| Cryptocurrency Portfolio Management with Deep Reinforcement Learning | Code | 1 |
| Learning to superoptimize programs - Workshop Version | | 0 |
| Deep Learning of Robotic Tasks without a Simulator using Strong and Weak Human Supervision | | 0 |
| Self-critical Sequence Training for Image Captioning | Code | 1 |
| Showing versus doing: Teaching by demonstration | | 0 |
| Adaptive optimal training of animal behavior | | 0 |
| Linear Feature Encoding for Reinforcement Learning | | 0 |
| Bayesian Optimization with Robust Bayesian Neural Networks | Code | 0 |
| Bootstrapping incremental dialogue systems: using linguistic knowledge to learn from minimal data | | 0 |
| Playing Doom with SLAM-Augmented Deep Reinforcement Learning | Code | 0 |
| Generalizing Skills with Semi-Supervised Reinforcement Learning | | 0 |
| Exploration for Multi-task Reinforcement Learning with Deep Generative Models | | 0 |
| Dialogue Learning With Human-In-The-Loop | Code | 2 |
| Neural Combinatorial Optimization with Reinforcement Learning | Code | 1 |
| Nonparametric General Reinforcement Learning | | 0 |
| Learning to Compose Words into Sentences with Reinforcement Learning | | 0 |
| Improving Policy Gradient by Exploring Under-appreciated Rewards | | 0 |
| Deep Reinforcement Learning for Multi-Domain Dialogue Systems | Code | 0 |
| Training an Interactive Humanoid Robot Using Multimodal Deep Reinforcement Learning | Code | 0 |
| A Simple, Fast Diverse Decoding Algorithm for Neural Generation | Code | 0 |
| Multiscale Inverse Reinforcement Learning using Diffusion Wavelets | | 0 |
| Recurrent Attention Models for Depth-Based Person Identification | | 0 |
| Variational Intrinsic Control | Code | 0 |
Page 293 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |