SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, a decision-making strategy that maps states to actions, that maximizes the expected long-term reward.
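The interaction loop described above can be illustrated with a minimal sketch of tabular Q-learning, one common RL algorithm. Everything here is an assumption made for illustration and does not come from any listed paper: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only on reaching the rightmost state."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values Q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore at random sometimes (and on ties),
            # otherwise pick the action with the highest estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus the discounted
            # value of the best action in the next state.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the reward is received only at the end of the chain, so the agent has to propagate that signal backward through the discount factor `gamma`; the learned greedy policy is then the "decision-making strategy" in the sense of the definition above.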

Papers

Showing 14351–14400 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Deep Q-Learning for Self-Organizing Networks Fault Management and Radio Performance Improvement | - | 0 |
| Emergence of Locomotion Behaviours in Rich Environments | Code | 1 |
| Learning human behaviors from motion capture by adversarial imitation | Code | 0 |
| Trust-PCL: An Off-Policy Trust Region Method for Continuous Control | - | 0 |
| The Complex Negotiation Dialogue Game | - | 0 |
| Hindsight Experience Replay | Code | 1 |
| Learning to Design Games: Strategic Environments in Reinforcement Learning | - | 0 |
| OPEB: Open Physical Environment Benchmark for Artificial Intelligence | - | 0 |
| ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games | Code | 0 |
| Maintaining cooperation in complex social dilemmas using deep reinforcement learning | - | 0 |
| Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning | Code | 0 |
| Hashing over Predicted Future Frames for Informed Exploration of Deep Reinforcement Learning | - | 0 |
| Grammatical Error Correction with Neural Reinforcement Learning | - | 0 |
| Action-Decision Networks for Visual Tracking With Deep Reinforcement Learning | Code | 0 |
| Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management | - | 0 |
| Neural Sequence Model Training via α-divergence Minimization | Code | 0 |
| A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem | Code | 1 |
| Noisy Networks for Exploration | Code | 0 |
| Path Integral Networks: End-to-End Differentiable Optimal Control | - | 0 |
| Learning to Learn: Meta-Critic Networks for Sample Efficient Learning | - | 0 |
| Actor-Critic Sequence Training for Image Captioning | - | 0 |
| Neural SLAM: Learning to Explore with External Memory | Code | 0 |
| Interpretability via Model Extraction | - | 0 |
| Uncertainty Decomposition in Bayesian Neural Networks with Latent Variables | - | 0 |
| Count-Based Exploration in Feature Space for Reinforcement Learning | Code | 0 |
| Temporal-related Convolutional-Restricted-Boltzmann-Machine capable of learning relational order via reinforcement learning procedure? | - | 0 |
| A Self-Adaptive Proposal Model for Temporal Action Detection based on Reinforcement Learning | Code | 0 |
| Structure Learning in Motor Control: A Deep Reinforcement Learning Model | - | 0 |
| Observational Learning by Reinforcement Learning | - | 0 |
| Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines | - | 0 |
| Toward Real-Time Decentralized Reinforcement Learning using Finite Support Basis Functions | - | 0 |
| Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control | Code | 0 |
| Dex: Incremental Learning for Complex Environments in Deep Reinforcement Learning | Code | 0 |
| Pedestrian Prediction by Planning using Deep Neural Networks | - | 0 |
| Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning | - | 0 |
| Value-Decomposition Networks For Cooperative Multi-Agent Learning | Code | 1 |
| Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations | Code | 0 |
| Reinforcement Learning under Model Mismatch | - | 0 |
| Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning | Code | 0 |
| Reinforcement Learning with Budget-Constrained Nonparametric Function Approximation for Opportunistic Spectrum Access | - | 0 |
| On Optimistic versus Randomized Exploration in Reinforcement Learning | - | 0 |
| Device Placement Optimization with Reinforcement Learning | Code | 0 |
| Hybrid Reward Architecture for Reinforcement Learning | Code | 0 |
| Deep reinforcement learning from human preferences | Code | 0 |
| ACCNet: Actor-Coordinator-Critic Net for "Learning-to-Communicate" with Deep Multi-agent Reinforcement Learning | - | 0 |
| Symmetry Learning for Function Approximation in Reinforcement Learning | - | 0 |
| Unlocking the Potential of Simulators: Design with RL in Mind | - | 0 |
| Efficient Reinforcement Learning via Initial Pure Exploration | - | 0 |
| Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments | Code | 1 |
| Parameter Space Noise for Exploration | Code | 0 |
Page 288 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified |