SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
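
The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, chosen here only as one simple example of an RL algorithm, on a hypothetical five-cell corridor. The environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are assumptions made for illustration and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward of 1 only when the goal cell is reached, 0 otherwise.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target; no future value once the episode has ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this sketch the reward signal is the only feedback the agent receives, and the policy is read off the learned action values after training. Larger problems typically replace the table with a function approximator, which is the setting most of the papers listed below address.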

Papers

Showing 4201–4225 of 15113 papers

Title | Status | Hype
Towards Scalable Verification of Deep Reinforcement Learning | Code | 0
Towards Similarity Graphs Constructed by Deep Reinforcement Learning | Code | 0
Reinforcement Learning with Dual-Observation for General Video Game Playing | Code | 0
Robust Reinforcement Learning in Continuous Control Tasks with Uncertainty Set Regularization | Code | 0
Robust Reinforcement Learning Objectives for Sequential Recommender Systems | Code | 0
Towards Solving Text-based Games by Producing Adaptive Action Spaces | Code | 0
Towards Symbolic Reinforcement Learning with Common Sense | Code | 0
Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming | Code | 0
Towards the Use of Deep Reinforcement Learning with Global Policy For Query-based Extractive Summarisation | Code | 0
Monte Carlo Q-learning for General Game Playing | Code | 0
Robust Reinforcement Learning Under Minimax Regret for Green Security | Code | 0
Robust Reinforcement Learning under model misspecification | Code | 0
Robust Reinforcement Learning via Adversarial training with Langevin Dynamics | Code | 0
Regret Minimization Experience Replay in Off-Policy Reinforcement Learning | Code | 0
Regret Minimization for Partially Observable Deep Reinforcement Learning | Code | 0
Robust Reinforcement Learning with Dynamic Distortion Risk Measures | Code | 0
Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions | Code | 0
Toybox: A Suite of Environments for Experimental Evaluation of Deep Reinforcement Learning | Code | 0
ToyBox: Better Atari Environments for Testing Reinforcement Learning Agents | Code | 0
Regret Minimization for Reinforcement Learning with Vectorial Feedback and Complex Objectives | Code | 0
Integrating Distributed Architectures in Highly Modular RL Libraries | Code | 0
Unsupervised Attention Mechanism across Neural Network Layers | Code | 0
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice | Code | 0
Tracking Object Positions in Reinforcement Learning: A Metric for Keypoint Detection (extended version) | Code | 0
Regularization Matters in Policy Optimization | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified