SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
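The agent-environment loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and comes from no paper listed on this page: a hypothetical five-cell corridor environment (`step`), tabular Q-learning as the learning rule, and arbitrarily chosen hyperparameters (`alpha`, `gamma`, `epsilon`).

```python
import random

N_STATES = 5      # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _episode in range(episodes):
        state = 0
        for _t in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Action with the highest estimated value in each non-goal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns one action per non-goal cell. In this toy setting the reward arrives only at the goal, so the learned policy is expected to prefer stepping right, which is the "long-term reward" idea in miniature.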

Papers

Showing 11851-11900 of 15113 papers

Title | Status | Hype
Sparse Skill Coding: Learning Behavioral Hierarchies with Sparse Codes | | 0
Pre-training as Batch Meta Reinforcement Learning with tiMe | | 0
Sequence-level Intrinsic Exploration Model for Partially Observable Domains | | 0
Model-free Learning Control of Nonlinear Stochastic Systems with Stability Guarantee | | 0
Counterfactual Regularization for Model-Based Reinforcement Learning | | 0
Deep RL for Blood Glucose Control: Lessons, Challenges, and Opportunities | | 0
BANANAS: Bayesian Optimization with Neural Networks for Neural Architecture Search | | 0
Efficient meta reinforcement learning via meta goal generation | | 0
Learning to Reason: Distilling Hierarchy via Self-Supervision and Reinforcement Learning | | 0
Advantage Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | | 0
HIPPOCAMPAL NEURONAL REPRESENTATIONS IN CONTINUAL LEARNING | | 0
Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning | | 0
DeepAGREL: Biologically plausible deep learning via direct reinforcement | | 0
Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning | | 0
Assessing Generalization in TD methods for Deep Reinforcement Learning | | 0
Behavior-Guided Reinforcement Learning | | 0
Learning Key Steps to Attack Deep Reinforcement Learning Agents | | 0
Learning by shaking: Computing policy gradients by physical forward-propagation | | 0
Contextual Inverse Reinforcement Learning | | 0
Adapt-to-Learn: Policy Transfer in Reinforcement Learning | | 0
Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer | | 0
Attention Privileged Reinforcement Learning for Domain Transfer | | 0
CAPACITY-LIMITED REINFORCEMENT LEARNING: APPLICATIONS IN DEEP ACTOR-CRITIC METHODS FOR CONTINUOUS CONTROL | | 0
AUGMENTED POLICY GRADIENT METHODS FOR EFFICIENT REINFORCEMENT LEARNING | | 0
Consistent Meta-Reinforcement Learning via Model Identification and Experience Relabeling | | 0
How many weights are enough : can tensor factorization learn efficient policies ? | | 0
Improving Exploration of Deep Reinforcement Learning using Planning for Policy Search | | 0
Do recent advancements in model-based deep reinforcement learning really improve data efficiency? | | 0
CrossNorm: On Normalization for Off-Policy Reinforcement Learning | | 0
Learning Good Policies By Learning Good Perceptual Models | | 0
Generalizing Reinforcement Learning to Unseen Actions | | 0
Learning Functionally Decomposed Hierarchies for Continuous Navigation Tasks | | 0
Evo-NAS: Evolutionary-Neural Hybrid Agent for Architecture Search | | 0
Learning Temporal Abstraction with Information-theoretic Constraints for Hierarchical Reinforcement Learning | | 0
City Metro Network Expansion with Reinforcement Learning | | 0
Learning to Reach Goals Without Reinforcement Learning | | 0
Hope For The Best But Prepare For The Worst: Cautious Adaptation In RL Agents | | 0
Learning World Graph Decompositions To Accelerate Reinforcement Learning | | 0
Learning with Social Influence through Interior Policy Differentiation | | 0
Event Discovery for History Representation in Reinforcement Learning | | 0
Avoiding Negative Side-Effects and Promoting Safe Exploration with Imaginative Planning | | 0
Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning | | 0
Robust Domain Randomization for Reinforcement Learning | | 0
Multi-step Greedy Policies in Model-Free Deep Reinforcement Learning | | 0
Trajectory representation learning for Multi-Task NMRDPs planning | | 0
Policy Tree Network | | 0
Training a Constrained Natural Media Painting Agent using Reinforcement Learning | | 0
S2VG: Soft Stochastic Value Gradient method | | 0
Subjective Reinforcement Learning for Open Complex Environments | | 0
Mint: Matrix-Interleaving for Multi-Task Learning | | 0
Page 238 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified