SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
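As a minimal sketch of the interaction loop described above, the following uses tabular Q-learning on a hypothetical five-state corridor. The environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameters (learning rate `alpha`, discount factor `gamma`, exploration rate `epsilon`) are all assumptions chosen for illustration. They do not come from any paper listed on this page, and Q-learning is only one of many ways to search for a reward-maximizing policy.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # reward at the goal, small penalty otherwise
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Learn a table of action values from reward feedback alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Target = immediate reward plus discounted best estimated future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every non-terminal state."""
    return [ACTIONS[max(range(2), key=lambda i: row[i])] for row in q[:-1]]
```

Calling `greedy_policy(train())` returns one action per non-terminal state. In this toy setting the agent should learn to move right everywhere, since that path reaches the rewarded goal in the fewest steps.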

Papers

Showing 9701-9725 of 15113 papers

Title | Status | Hype
Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning | | 0
Self-Supervised Continuous Control without Policy Gradient | | 0
Robust Multi-Agent Reinforcement Learning Driven by Correlated Equilibrium | | 0
Re-examining Routing Networks for Multi-task Learning | | 0
Scalable Bayesian Inverse Reinforcement Learning by Auto-Encoding Reward | | 0
PAC-Bayesian Randomized Value Function with Informative Prior | | 0
MQES: Max-Q Entropy Search for Efficient Exploration in Continuous Reinforcement Learning | | 0
On Trade-offs of Image Prediction in Visual Model-Based Reinforcement Learning | | 0
Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates | | 0
Structure and randomness in planning and reinforcement learning | Code | 0
Understanding and Leveraging Causal Relations in Deep Reinforcement Learning | | 0
Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks | | 0
RECONNAISSANCE FOR REINFORCEMENT LEARNING WITH SAFETY CONSTRAINTS | | 0
When Is Generalizable Reinforcement Learning Tractable? | | 0
R-LAtte: Attention Module for Visual Control via Reinforcement Learning | | 0
What are the Statistical Limits of Batch RL with Linear Function Approximation? | | 0
Regioned Episodic Reinforcement Learning | | 0
Fine-Tuning Offline Reinforcement Learning with Model-Based Policy Optimization | | 0
Aspect-based Sentiment Classification via Reinforcement Learning | | 0
A Simple Sparse Denoising Layer for Robust Deep Learning | | 0
Learning Latent Landmarks for Generalizable Planning | | 0
Coordinated Multi-Agent Exploration Using Shared Goals | | 0
FSV: Learning to Factorize Soft Value Function for Cooperative Multi-Agent Reinforcement Learning | | 0
Cross-State Self-Constraint for Feature Generalization in Deep Reinforcement Learning | | 0
Learning Predictive Communication by Imagination in Networked System Control | | 0
Page 389 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified