SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
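As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning on a toy five-state chain. None of it comes from the listed papers: the environment, the `step`, `train`, and `greedy_policy` helpers, the uniformly random behavior policy, and the hyperparameters are all assumptions made for this example.

```python
import random

# Hypothetical toy setup: a chain of 5 states; state 4 is the terminal goal.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor (assumed value)
ALPHA = 0.5        # learning rate (assumed value)


def step(state, action):
    """Toy environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, max_steps=100, seed=0):
    """Learn action values with Q-learning under a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            action = rng.choice(ACTIONS)                 # explore at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Because Q-learning is off-policy, the agent can behave randomly while still estimating the values of the greedy policy; a practical agent would more likely use an epsilon-greedy or learned behavior policy.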

Papers

Showing 10276–10300 of 15113 papers

Title | Status | Hype
Learning Intrinsic Symbolic Rewards in Reinforcement Learning | | 0
Maximum Reward Formulation In Reinforcement Learning | Code | 0
Nonstationary Reinforcement Learning with Linear Function Approximation | | 0
Provable Fictitious Play for General Mean-Field Games | | 0
Regularized Inverse Reinforcement Learning | | 0
Reinforcement Learning for Many-Body Ground-State Preparation Inspired by Counterdiabatic Driving | | 0
Online Safety Assurance for Deep Reinforcement Learning | | 0
Variational Intrinsic Control Revisited | | 0
Model-Free Non-Stationary RL: Near-Optimal Regret and Applications in Multi-Agent RL and Inventory Control | | 0
Actor-Critic Algorithm for High-dimensional Partial Differential Equations | | 0
Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective | | 0
Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited | | 0
Learning Diverse Options via InfoMax Termination Critic | Code | 0
Heterogeneous Multi-Agent Reinforcement Learning for Unknown Environment Mapping | | 0
Safety Aware Reinforcement Learning (SARL) | | 0
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning | | 0
Sentiment Analysis for Reinforcement Learning | | 0
Meta-Learning of Structured Task Distributions in Humans and Machines | Code | 0
Policy Learning Using Weak Supervision | Code | 0
The act of remembering: a study in partially observable reinforcement learning | | 0
Deep Reinforcement Learning for Electric Vehicle Routing Problem with Time Windows | | 0
Deep Reinforcement Learning for Collaborative Edge Computing in Vehicular Networks | | 0
Learning to Generalize for Sequential Decision Making | Code | 0
A Distributed Model-Free Ride-Sharing Approach for Joint Matching, Pricing, and Dispatching using Deep Reinforcement Learning | | 0
Goal-directed Generation of Discrete Structures with Conditional Generative Models | | 0
Page 412 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified