SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
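The interaction loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-state chain environment (`ChainEnv`, not from any library) and uses tabular Q-learning as one example of a learning rule; the hyperparameters (`alpha`, `gamma`, `epsilon`) are arbitrary illustrative choices, not recommended values.

```python
import random


def discounted_return(rewards, gamma=0.99):
    """Cumulative discounted reward: sum over t of gamma**t * rewards[t]."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1, reward +1 on reaching the last state."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right; action 0 moves left (clipped at state 0).
        move = 1 if action == 1 else -1
        self.state = max(0, self.state + move)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def q_learning(env, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1,
               max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative only)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, or when the two actions are tied.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next, r, done = env.step(a)
            # Move the estimate toward the one-step bootstrapped target.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to action 1)."""
    return [0 if row[0] > row[1] else 1 for row in q]
```

Calling `greedy_policy(q_learning(ChainEnv()))` returns one action per state. In this toy setup the learned policy should prefer moving right in every non-terminal state, since that is the only way to collect the reward.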

Papers

Showing 3501-3525 of 15113 papers

Title | Status | Hype
Automated Gadget Discovery in Science | Code | 0
Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM | Code | 0
Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning | Code | 0
Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem | Code | 0
ARAML: A Stable Adversarial Training Framework for Text Generation | Code | 0
Pac-Man Pete: An extensible framework for building AI in VEX Robotics | Code | 0
Automated Image Data Preprocessing with Deep Reinforcement Learning | Code | 0
Adversarial Environment Generation for Learning to Navigate the Web | Code | 0
Parameter-Based Value Functions | Code | 0
Parameter-free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients | Code | 0
Arachnophobia Exposure Therapy using Experience-driven Procedural Content Generation via Reinforcement Learning (EDPCGRL) | Code | 0
Accelerating Reinforcement Learning through GPU Atari Emulation | Code | 0
GRAC: Self-Guided and Self-Regularized Actor-Critic | Code | 0
Grammars and reinforcement learning for molecule optimization | Code | 0
Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning | Code | 0
GoSum: Extractive Summarization of Long Documents by Reinforcement Learning and Graph Organized discourse state | Code | 0
Automated quantum programming via reinforcement learning for combinatorial optimization | Code | 0
Google Research Football: A Novel Reinforcement Learning Environment | Code | 0
Gotta Learn Fast: A New Benchmark for Generalization in RL | Code | 0
A Quadratic Actor Network for Model-Free Reinforcement Learning | Code | 0
Goal Recognition as Reinforcement Learning | Code | 0
Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning | Code | 0
Graph Backup: Data Efficient Backup Exploiting Markovian Transitions | Code | 0
gTLO: A Generalized and Non-linear Multi-Objective Deep Reinforcement Learning Approach | Code | 0
Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes | Code | 0
Page 141 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified