SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
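
The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed here: the five-state chain environment, the `step`, `train`, `choose_action`, and `greedy_policy` helpers, and all hyperparameter values (`alpha`, `gamma`, `epsilon`) are assumptions made for the example.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal). Hypothetical toy setup.
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption for this sketch): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def discounted_return(rewards, gamma):
    """Cumulative reward of one episode: sum of gamma**t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=200, seed=0):
    """Learn a Q-table by interacting with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy that picks the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Under these assumptions, `greedy_policy(train())` should select the move-right action in every non-terminal state, since that is the shortest route to the only rewarded state. `discounted_return` shows how a later reward counts for less in the cumulative signal when `gamma` is below 1.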

Papers

Showing 11026–11050 of 15113 papers

Title | Status | Hype
Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning? | | 0
Improving Robustness via Risk Averse Distributional Reinforcement Learning | | 0
Exploration in Reinforcement Learning with Deep Covering Options | | 0
Episodic Reinforcement Learning with Associative Memory | | 0
Learning Efficient Parameter Server Synchronization Policies for Distributed SGD | | 0
Synthesizing Programmatic Policies that Inductively Generalize | | 0
Model Based Reinforcement Learning for Atari | | 0
Model-based reinforcement learning for biological sequence design | | 0
Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control | | 0
Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information | | 0
The Ingredients of Real World Robotic Reinforcement Learning | | 0
Reinforcement learning of minimalist grammars | | 0
Unsupervised Learning of KB Queries in Task-Oriented Dialogs | | 0
Towards Embodied Scene Description | | 0
Out-of-the-box channel pruned networks | | 0
Plan-Space State Embeddings for Improved Reinforcement Learning | | 0
DSAC: Distributional Soft Actor Critic for Risk-Sensitive Reinforcement Learning | | 0
GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning | | 0
Delay-aware Resource Allocation in Fog-assisted IoT Networks Through Reinforcement Learning | | 0
Improving Factual Consistency Between a Response and Persona Facts | | 0
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging | | 0
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning | | 0
Graph-based State Representation for Deep Reinforcement Learning | Code | 0
Reduced-Dimensional Reinforcement Learning Control using Singular Perturbation Approximations | | 0
Whittle index based Q-learning for restless bandits with average reward | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified