SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives for its actions, in the form of rewards or punishments. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
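As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one of many RL algorithms and not one prescribed by this page. The five-cell corridor environment, the reward of 1 at the goal, and all names and hyperparameters (`step`, `train`, `alpha`, `gamma`, `epsilon`) are assumptions made up for this example.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment dynamics: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward of one episode: sum of gamma**t * r_t."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table by interacting with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so every episode terminates
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action per non-terminal cell according to the Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` in this toy setup yields a policy that steps right in every non-terminal cell, which is the strategy that maximizes the discounted long-term reward here. Real tasks typically replace the table with a function approximator and the corridor with a far richer environment.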

Papers

Showing 11701-11750 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Which Channel to Ask My Question? Personalized Customer Service Request Stream Routing using Deep Reinforcement Learning | | 0 |
| Scaling active inference | | 0 |
| Corpus-Level End-to-End Exploration for Interactive Systems | Code | 0 |
| Dynamic Control of a Fiber Manufacturing Process using Deep Reinforcement Learning | Code | 0 |
| Iteratively-Refined Interactive 3D Medical Image Segmentation with Multi-Agent Reinforcement Learning | | 0 |
| From Persistent Homology to Reinforcement Learning with Applications for Retail Banking | | 0 |
| Fleet Control using Coregionalized Gaussian Process Policy Iteration | Code | 0 |
| Analysis of Evolutionary Behavior in Self-Learning Media Search Engines | | 0 |
| Graph Pruning for Model Compression | | 0 |
| Deep Reinforcement Learning for Trading | | 0 |
| DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning | Code | 0 |
| Information-Theoretic Confidence Bounds for Reinforcement Learning | | 0 |
| Efficient Drone Mobility Support Using Reinforcement Learning | | 0 |
| Accelerating Reinforcement Learning with Suboptimal Guidance | | 0 |
| Agent Probing Interaction Policies | | 0 |
| Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means | Code | 0 |
| State Alignment-based Imitation Learning | | 0 |
| Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control | Code | 0 |
| Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning | | 0 |
| Safe Policies for Reinforcement Learning via Primal-Dual Methods | | 0 |
| On Policy Learning Robust to Irreversible Events: An Application to Robotic In-Hand Manipulation | | 0 |
| Deep Reinforcement Learning in Cryptocurrency Market Making | | 0 |
| Hierarchical Average Reward Policy Gradient Algorithms | | 0 |
| Avoiding Jammers: A Reinforcement Learning Approach | | 0 |
| A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound | | 0 |
| Corruption-robust exploration in episodic reinforcement learning | | 0 |
| Bayesian Curiosity for Efficient Exploration in Reinforcement Learning | Code | 0 |
| Decision Making for Autonomous Driving via Augmented Adversarial Inverse Reinforcement Learning | | 0 |
| Attention-Privileged Reinforcement Learning | | 0 |
| Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning | Code | 0 |
| Efficient decorrelation of features using Gramian in Reinforcement Learning | | 0 |
| MANGA: Method Agnostic Neural-policy Generalization and Adaptation | | 0 |
| Variance Reduced Advantage Estimation with δ Hindsight Credit Assignment | | 0 |
| Placement Optimization of Aerial Base Stations with Deep Reinforcement Learning | | 0 |
| Planning with Goal-Conditioned Policies | Code | 0 |
| Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation | | 0 |
| Influence-aware Memory Architectures for Deep Reinforcement Learning | Code | 0 |
| Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning | | 0 |
| Comments on the Du-Kakade-Wang-Yang Lower Bounds | | 0 |
| Efficient Exploration through Intrinsic Motivation Learning for Unsupervised Subgoal Discovery in Model-Free Hierarchical Reinforcement Learning | | 0 |
| IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks | Code | 0 |
| Hebbian Synaptic Modifications in Spiking Neurons that Learn | | 0 |
| Generalized Maximum Causal Entropy for Inverse Reinforcement Learning | | 0 |
| Inverse Reinforcement Learning with Missing Data | | 0 |
| Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance | | 0 |
| Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare | Code | 0 |
| Off-Policy Policy Gradient Algorithms by Constraining the State Distribution Shift | | 0 |
| Data-efficient Co-Adaptation of Morphology and Behaviour with Deep Reinforcement Learning | | 0 |
| Improved Exploration through Latent Trajectory Optimization in Deep Deterministic Policy Gradient | | 0 |
| Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning | Code | 0 |
Page 235 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |