SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
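The interaction loop described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the five-state chain environment, the `step` and `train` helpers, and all hyperparameters are hypothetical, and tabular Q-learning is used as one common way to learn a policy from rewards, not as the method of any paper listed on this page.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy task)
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: reward 1.0 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions look
            # equally good (so the untrained agent still moves around).
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states: the action with the highest value.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

Here the reward arrives only at the end of the chain, so the agent has to propagate that signal backwards through the discount factor `gamma` to learn that stepping right is the better long-term choice in every state.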

Papers

Showing 10701–10750 of 15113 papers

Title | Status | Hype
Synthesizing Programmatic Policies that Inductively Generalize |  | 0
AMRL: Aggregated Memory For Reinforcement Learning |  | 0
Episodic Reinforcement Learning with Associative Memory |  | 0
Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning |  | 0
Learning Collaborative Agents with Rule Guidance for Knowledge Graph Reasoning | Code | 1
Improving Robustness via Risk Averse Distributional Reinforcement Learning |  | 0
Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning? |  | 0
Delay-aware Resource Allocation in Fog-assisted IoT Networks Through Reinforcement Learning |  | 0
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning |  | 0
GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning |  | 0
Improving Factual Consistency Between a Response and Persona Facts |  | 0
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging |  | 0
DSAC: Distributional Soft Actor Critic for Risk-Sensitive Reinforcement Learning |  | 0
Out-of-the-box channel pruned networks |  | 0
Towards Embodied Scene Description |  | 0
Reinforcement learning of minimalist grammars |  | 0
Plan-Space State Embeddings for Improved Reinforcement Learning |  | 0
Unsupervised Learning of KB Queries in Task-Oriented Dialogs |  | 0
Reinforcement Learning with Augmented Data | Code | 1
Reduced-Dimensional Reinforcement Learning Control using Singular Perturbation Approximations |  | 0
Whittle index based Q-learning for restless bandits with average reward |  | 0
Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks |  | 0
Molecular Design in Synthetically Accessible Chemical Space via Deep Reinforcement Learning |  | 0
Hierarchical Reinforcement Learning for Automatic Disease Diagnosis | Code | 1
Actor-Critic Reinforcement Learning for Control with Stability Guarantee | Code | 1
Graph-based State Representation for Deep Reinforcement Learning | Code | 0
Improving Sample Efficiency and Multi-Agent Communication in RL-based Train Rescheduling |  | 0
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels | Code | 1
The Immersion of Directed Multi-graphs in Embedding Fields. Generalisations |  | 0
Transferable Active Grasping and Real Embodied Dataset | Code | 1
Can We Learn Heuristics For Graphical Model Inference Using Reinforcement Learning? |  | 0
First return, then explore | Code | 1
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning | Code | 1
Evolving Inborn Knowledge For Fast Adaptation in Dynamic POMDP Problems | Code | 0
Adaptive model selection in photonic reservoir computing by reinforcement learning |  | 0
Age-Aware Status Update Control for Energy Harvesting IoT Sensors via Reinforcement Learning |  | 0
The Ingredients of Real-World Robotic Reinforcement Learning |  | 0
Reinforcement Learning Generalization with Surprise Minimization | Code | 0
Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning | Code | 1
A State Aggregation Approach for Solving Knapsack Problem with Deep Reinforcement Learning |  | 0
Curiosity-Driven Energy-Efficient Worker Scheduling in Vehicular Crowdsourcing: A Deep Reinforcement Learning Approach | Code | 1
CFR-RL: Traffic Engineering with Reinforcement Learning in SDN | Code | 1
The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget | Code | 2
PBCS : Efficient Exploration and Exploitation Using a Synergy between Reinforcement Learning and Motion Planning |  | 0
Self-Paced Deep Reinforcement Learning | Code | 1
Automatic low-bit hybrid quantization of neural networks through meta learning |  | 0
Learning Dialog Policies from Weak Demonstrations |  | 0
Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning |  | 0
Correct Me If You Can: Learning from Error Corrections and Markings | Code | 0
Guiding Robot Exploration in Reinforcement Learning via Automated Planning |  | 0
Page 215 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified