SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
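The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from any paper listed here: the `ChainEnv` environment, the `train` and `greedy_policy` helpers, and all hyperparameter values are hypothetical, and tabular Q-learning is used only as one simple, well-known way to learn a policy from reward feedback.

```python
import random


class ChainEnv:
    """Hypothetical 5-state chain: start in state 0, reward 1.0 on reaching the last state."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left (clamped at the ends).
        if action == 1:
            self.state = min(self.state + 1, self.n_states - 1)
        else:
            self.state = max(self.state - 1, 0)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values only)."""
    rng = random.Random(seed)
    env = ChainEnv()
    q = [[0.0, 0.0] for _ in range(env.n_states)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, or when both actions look equally good.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # The target is the reward plus the discounted value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties resolve to action 1)."""
    return [0 if left > right else 1 for left, right in q]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

The discount factor `gamma` is what makes the agent value long-term reward: the only reward sits at the far end of the chain, so earlier states acquire value only through the discounted estimates of the states that follow them.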

Papers

Showing 3526–3550 of 15113 papers

Title | Status | Hype
Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem | Code | 0
ARAML: A Stable Adversarial Training Framework for Text Generation | Code | 0
Adversarial Environment Generation for Learning to Navigate the Web | Code | 0
Arachnophobia Exposure Therapy using Experience-driven Procedural Content Generation via Reinforcement Learning (EDPCGRL) | Code | 0
Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning | Code | 0
Grammars and reinforcement learning for molecule optimization | Code | 0
GRAC: Self-Guided and Self-Regularized Actor-Critic | Code | 0
Gotta Learn Fast: A New Benchmark for Generalization in RL | Code | 0
Deep Reinforcement Learning-based Exploration of Web Applications | Code | 0
Accelerating Reinforcement Learning through GPU Atari Emulation | Code | 0
Google Research Football: A Novel Reinforcement Learning Environment | Code | 0
A Quadratic Actor Network for Model-Free Reinforcement Learning | Code | 0
Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning | Code | 0
GoSum: Extractive Summarization of Long Documents by Reinforcement Learning and Graph Organized discourse state | Code | 0
Guided Dialog Policy Learning without Adversarial Learning in the Loop | Code | 0
PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos | Code | 0
Goal-Conditioned Q-Learning as Knowledge Distillation | Code | 0
Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning | Code | 0
Goal-conditioned Imitation Learning | Code | 0
Active exploration in parameterized reinforcement learning | Code | 0
Goal Recognition as Reinforcement Learning | Code | 0
Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes | Code | 0
Combining Automated Optimisation of Hyperparameters and Reward Shape | Code | 0
Combined Reinforcement Learning via Abstract Representations | Code | 0
GLIB: Efficient Exploration for Relational Model-Based Reinforcement Learning via Goal-Literal Babbling | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified