SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
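The agent-environment loop described above can be sketched on a toy problem. What follows is a minimal illustration, not a method taken from any paper listed on this page: it assumes a hypothetical five-state chain in which the agent starts at state 0 and receives a reward of 1 only on reaching state 4, and it uses tabular Q-learning as one example of a learning rule. All names (`step`, `train`, `greedy_policy`, `N_STATES`, `GAMMA`) and all hyperparameter values are assumptions chosen for the example.

```python
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right
GAMMA = 0.9         # discount factor applied to future reward


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action (as -1 or +1) for each non-terminal state."""
    return [ACTIONS[max((0, 1), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

On this toy task the learned greedy policy moves right in every non-terminal state, which is the strategy that maximizes the discounted long-term reward. The reward here is the only feedback the agent receives; the environment's dynamics are never given to it directly.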

Papers

Showing 3376-3400 of 15113 papers

Title | Status | Hype
Deep Adaptive Multi-Intention Inverse Reinforcement Learning | Code | 0
Compositional Conservatism: A Transductive Approach in Offline Reinforcement Learning | Code | 0
Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning | Code | 0
Actively Learning Costly Reward Functions for Reinforcement Learning | Code | 0
Composable Deep Reinforcement Learning for Robotic Manipulation | Code | 0
Complex Model Transformations by Reinforcement Learning with Uncertain Human Guidance | Code | 0
Next-Best-View Estimation based on Deep Reinforcement Learning for Active Object Classification | Code | 0
A Reinforcement Learning Approach for Performance-aware Reduction in Power Consumption of Data Center Compute Nodes | Code | 0
DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs | Code | 0
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling | Code | 0
Augmented Q Imitation Learning (AQIL) | Code | 0
GuideLight: "Industrial Solution" Guidance for More Practical Traffic Signal Control Agents | Code | 0
Guided Policy Optimization under Partial Observability | Code | 0
Noisy Zero-Shot Coordination: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games | Code | 0
Reinforcement learning with non-ergodic reward increments: robustness via ergodicity transformations | Code | 0
Guiding Evolutionary Strategies by Differentiable Robot Simulators | Code | 0
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0
Harnessing Structures for Value-Based Planning and Reinforcement Learning | Code | 0
Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog | Code | 0
Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations | Code | 0
Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout | Code | 0
Guided Deep Reinforcement Learning for Swarm Systems | Code | 0
Guided Dialog Policy Learning without Adversarial Learning in the Loop | Code | 0
Competing for pixels: a self-play algorithm for weakly-supervised segmentation | Code | 0
gTLO: A Generalized and Non-linear Multi-Objective Deep Reinforcement Learning Approach | Code | 0
Page 136 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified