SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to act in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
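As a rough illustration of the agent-environment loop described above, the following sketch runs tabular Q-learning on a tiny chain world. Everything here is an assumption made for illustration: the five-state environment, the `step` and `train` helpers, and the hyperparameters (`alpha`, `gamma`, `epsilon`) are hypothetical and are not taken from any paper listed on this page. Q-learning is only one of many RL methods.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q_row, rng):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q[state], rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: row[a]) for row in q_table[:-1]]
print(policy)  # [1, 1, 1, 1]: always move right, toward the goal
```

The learned policy moves right in every state because only reaching state 4 pays a reward, and the discount factor `gamma` makes shorter paths to it more valuable.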

Papers

Showing 4151–4175 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Recurrent Sum-Product-Max Networks for Decision Making in Perfectly-Observed Environments | Code | 0 |
| Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning | Code | 0 |
| Regret-Based Defense in Adversarial Reinforcement Learning | Code | 0 |
| DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck | Code | 0 |
| Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning | Code | 0 |
| Towards Interpretable Reinforcement Learning Using Attention Augmented Agents | Code | 0 |
| MAD: A Magnitude And Direction Policy Parametrization for Stability Constrained Reinforcement Learning | Code | 0 |
| Robust Visual Domain Randomization for Reinforcement Learning | Code | 0 |
| Recursive generalized type-2 fuzzy radial basis function neural networks for joint position estimation and adaptive EMG-based impedance control of lower limb exoskeletons | Code | 0 |
| Performative Reinforcement Learning in Gradually Shifting Environments | Code | 0 |
| Performing Deep Recurrent Double Q-Learning for Atari Games | Code | 0 |
| Market Making via Reinforcement Learning | Code | 0 |
| Robust exploration in linear quadratic reinforcement learning | Code | 0 |
| Reinforcement Learning of Self Enhancing Camera Image and Signal Processing | Code | 0 |
| Towards Learning Transferable Conversational Skills using Multi-dimensional Dialogue Modelling | Code | 0 |
| Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning | Code | 0 |
| On Catastrophic Interference in Atari 2600 Games | Code | 0 |
| Meta Policy Learning for Cold-Start Conversational Recommendation | Code | 0 |
| Muscle Excitation Estimation in Biomechanical Simulation Using NAF Reinforcement Learning | Code | 0 |
| On Context Distribution Shift in Task Representation Learning for Offline Meta RL | Code | 0 |
| Towards Model-based Reinforcement Learning for Industry-near Environments | Code | 0 |
| Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch | Code | 0 |
| Robust Learning from Observation with Model Misspecification | Code | 0 |
| Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation | Code | 0 |
| MUSE: Modularizing Unsupervised Sense Embeddings | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |