SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
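The agent-environment loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, chosen here only as one simple example of learning a policy from rewards; the page itself does not prescribe an algorithm. The five-state chain task, the function names (`step`, `train_q_learning`, `greedy_policy`), and all hyperparameters are hypothetical choices made for illustration.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (hypothetical toy task)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Environment: return (next_state, reward, done) for one action."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # The only feedback is a reward of 1 on reaching the goal.
    return next_state, (1.0 if done else 0.0), done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learn action values from interaction (assumed hyperparameters)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Target = immediate reward + discounted value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy: pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train_q_learning())` returns one action per non-terminal state. In this toy task the learned policy should move right everywhere, since that is the shortest route to the only reward.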

Papers

Showing 3776-3800 of 15113 papers

Title | Status | Hype
Action Robust Reinforcement Learning and Applications in Continuous Control | Code | 0
Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs | Code | 0
Gap-Dependent Unsupervised Exploration for Reinforcement Learning | Code | 0
CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario | Code | 0
Explicitly Encouraging Low Fractional Dimensional Trajectories Via Reinforcement Learning | Code | 0
Reinforcement Learning with Success Induced Task Prioritization | Code | 0
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees | Code | 0
Fully Parameterized Quantile Function for Distributional Reinforcement Learning | Code | 0
Circular Microalgae-Based Carbon Control for Net Zero | Code | 0
Deep reinforcement learning in World-Earth system models to discover sustainable management strategies | Code | 0
Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case | Code | 0
A policy gradient approach for Finite Horizon Constrained Markov Decision Processes | Code | 0
Functional Acceleration for Policy Mirror Descent | Code | 0
Relational Deep Reinforcement Learning | Code | 0
Relational Graph Learning for Crowd Navigation | Code | 0
From Two-Dimensional to Three-Dimensional Environment with Q-Learning: Modeling Autonomous Navigation with Reinforcement Learning and no Libraries | Code | 0
Hierarchical Potential-based Reward Shaping from Task Specifications | Code | 0
Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing | Code | 0
Generalized Adaptive Transfer Network: Enhancing Transfer Learning in Reinforcement Learning Across Domains | Code | 0
APO: Enhancing Reasoning Ability of MLLMs via Asymmetric Policy Optimization | Code | 0
Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification | Code | 0
Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | Code | 0
Replication of Impedance Identification Experiments on a Reinforcement-Learning-Controlled Digital Twin of Human Elbows | Code | 0
From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction | Code | 0
From Gameplay to Symbolic Reasoning: Learning SAT Solver Heuristics in the Style of Alpha(Go) Zero | Code | 0
Page 152 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified