SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
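As a minimal sketch of the description above, the following example runs an agent-environment loop and scores two policies by their cumulative reward. The four-state chain environment, the reward values, the discount factor `gamma`, and all function names are assumptions made for illustration; none of them come from this page or from any listed paper.

```python
from typing import Callable, List, Tuple

# Hypothetical toy environment: states 0..3 on a line, state 3 is the goal.
N_STATES = 4
LEFT, RIGHT = -1, +1


def step(state: int, action: int) -> Tuple[int, float, bool]:
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    # Assumed reward scheme: +1 on reaching the goal, small penalty otherwise.
    reward = 1.0 if done else -0.1
    return next_state, reward, done


def rollout(policy: Callable[[int], int], start: int = 0,
            max_steps: int = 20) -> List[float]:
    """Let the policy interact with the environment; collect the rewards."""
    state, rewards = start, []
    for _ in range(max_steps):
        state, reward, done = step(state, policy(state))
        rewards.append(reward)
        if done:
            break
    return rewards


def discounted_return(rewards: List[float], gamma: float = 0.9) -> float:
    """Cumulative reward: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def always_right(state: int) -> int:
    return RIGHT


def always_left(state: int) -> int:
    return LEFT


if __name__ == "__main__":
    for name, policy in [("always_right", always_right),
                         ("always_left", always_left)]:
        print(name, round(discounted_return(rollout(policy)), 3))
```

Under these assumptions, `always_right` reaches the goal in three steps and earns a return of roughly 0.62, while `always_left` never reaches it and accumulates about -0.88 over 20 steps. An RL algorithm searches for the better policy from interaction alone instead of comparing hand-written candidates as done here.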

Papers

Showing 6676–6700 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Defending Observation Attacks in Deep Reinforcement Learning via Detection and Denoising | Code | 0 |
| FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks | | 0 |
| Deep Reinforcement Learning for Exact Combinatorial Optimization: Learning to Branch | | 0 |
| Computation Offloading and Resource Allocation in F-RANs: A Federated Deep Reinforcement Learning Approach | | 0 |
| Intrinsically motivated option learning: a comparative study of recent methods | | 0 |
| IGN : Implicit Generative Networks | Code | 0 |
| Analysis of Randomization Effects on Sim2Real Transfer in Reinforcement Learning for Robotic Manipulation Tasks | | 0 |
| Provable Benefit of Multitask Representation Learning in Reinforcement Learning | | 0 |
| Relative Policy-Transition Optimization for Fast Policy Transfer | | 0 |
| Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward | | 0 |
| RL-GA: A Reinforcement Learning-Based Genetic Algorithm for Electromagnetic Detection Satellite Scheduling Problem | | 0 |
| Matching options to tasks using Option-Indexed Hierarchical Reinforcement Learning | | 0 |
| Case-Based Inverse Reinforcement Learning Using Temporal Coherence | Code | 0 |
| Deep Reinforcement Learning for Optimal Investment and Saving Strategy Selection in Heterogeneous Profiles: Intelligent Agents working towards retirement | | 0 |
| Federated Offline Reinforcement Learning | | 0 |
| Does Self-supervised Learning Really Improve Reinforcement Learning from Pixels? | Code | 0 |
| An application of neural networks to a problem in knot theory and group theory (untangling braids) | | 0 |
| Deep Multi-Agent Reinforcement Learning with Hybrid Action Spaces based on Maximum Entropy | | 0 |
| Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning | Code | 0 |
| Large-Scale Retrieval for Reinforcement Learning | | 0 |
| Social Network Structure Shapes Innovation: Experience-sharing in RL with SAPIENS | | 0 |
| Multifidelity Reinforcement Learning with Control Variates | | 0 |
| Policy Gradient Reinforcement Learning for Uncertain Polytopic LPV Systems based on MHE-MPC | | 0 |
| Regret Bounds for Information-Directed Reinforcement Learning | | 0 |
| Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information | | 0 |
Page 268 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |