SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
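The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL methods and is chosen here purely for brevity. The five-state chain environment, the `step` function, and all hyperparameter values are hypothetical and are not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption for illustration).

    Returns (next_state, reward, done). Reaching the last state pays 1.0.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

With these settings the learned greedy policy should choose "move right" in every non-terminal state, since that is the shortest path to the only reward in this toy environment.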

Papers

Showing 14401-14450 of 15113 papers

Title | Status | Hype
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games | Code | 0
Maintaining cooperation in complex social dilemmas using deep reinforcement learning | | 0
OPEB: Open Physical Environment Benchmark for Artificial Intelligence | | 0
Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning | Code | 0
Hashing over Predicted Future Frames for Informed Exploration of Deep Reinforcement Learning | | 0
Grammatical Error Correction with Neural Reinforcement Learning | | 0
Action-Decision Networks for Visual Tracking With Deep Reinforcement Learning | Code | 0
Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management | | 0
Neural Sequence Model Training via α-divergence Minimization | Code | 0
Noisy Networks for Exploration | Code | 0
Neural SLAM: Learning to Explore with External Memory | Code | 0
Path Integral Networks: End-to-End Differentiable Optimal Control | | 0
Actor-Critic Sequence Training for Image Captioning | | 0
Learning to Learn: Meta-Critic Networks for Sample Efficient Learning | | 0
Interpretability via Model Extraction | | 0
Uncertainty Decomposition in Bayesian Neural Networks with Latent Variables | | 0
Count-Based Exploration in Feature Space for Reinforcement Learning | Code | 0
Temporal-related Convolutional-Restricted-Boltzmann-Machine capable of learning relational order via reinforcement learning procedure? | | 0
A Self-Adaptive Proposal Model for Temporal Action Detection based on Reinforcement Learning | Code | 0
Structure Learning in Motor Control:A Deep Reinforcement Learning Model | | 0
Observational Learning by Reinforcement Learning | | 0
Toward Real-Time Decentralized Reinforcement Learning using Finite Support Basis Functions | | 0
Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines | | 0
Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control | Code | 0
Dex: Incremental Learning for Complex Environments in Deep Reinforcement Learning | Code | 0
Pedestrian Prediction by Planning using Deep Neural Networks | | 0
Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning | | 0
Reinforcement Learning under Model Mismatch | | 0
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning | Code | 0
Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations | Code | 0
Reinforcement Learning with Budget-Constrained Nonparametric Function Approximation for Opportunistic Spectrum Access | | 0
On Optimistic versus Randomized Exploration in Reinforcement Learning | | 0
Hybrid Reward Architecture for Reinforcement Learning | Code | 0
Device Placement Optimization with Reinforcement Learning | Code | 0
Deep reinforcement learning from human preferences | Code | 0
ACCNet: Actor-Coordinator-Critic Net for "Learning-to-Communicate" with Deep Multi-agent Reinforcement Learning | | 0
Symmetry Learning for Function Approximation in Reinforcement Learning | | 0
Unlocking the Potential of Simulators: Design with RL in Mind | | 0
Efficient Reinforcement Learning via Initial Pure Exploration | | 0
Parameter Space Noise for Exploration | Code | 0
Towards Synthesizing Complex Programs from Input-Output Examples | | 0
UCB Exploration via Q-Ensembles | | 0
A method for the online construction of the set of states of a Markov Decision Process using Answer Set Programming | | 0
Actor-Critic for Linearly-Solvable Continuous MDP with Partially Known Dynamics | | 0
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning | | 0
Reinforcement Learning for Learning Rate Control | | 0
The Atari Grand Challenge Dataset | Code | 0
Sequential Dynamic Decision Making with Deep Neural Nets on a Test-Time Budget | | 0
Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models | Code | 0
Universal Reinforcement Learning Algorithms: Survey and Experiments | Code | 0
Page 289 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified