SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
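As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning on a toy corridor. Everything in it is an assumption made for illustration and does not come from this page or any listed paper: the environment (`step`, `N_STATES`, `GOAL`), the reward values, the hyperparameters, and the choice of Q-learning as the learning rule.

```python
import random

# Hypothetical toy environment: a corridor of N_STATES cells.
# The agent starts in cell 0 and the episode ends at the rightmost cell.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right
GOAL = N_STATES - 1


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    reward = 1.0 if done else -0.01  # small per-step penalty, bonus at the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """The learned policy: the highest-valued action in each state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q]


def rollout(policy, max_steps=50):
    """Follow a policy from cell 0 and return the cumulative reward."""
    state, total = 0, 0.0
    for _ in range(max_steps):
        state, reward, done = step(state, policy[state])
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    policy = greedy_policy(train())
    print("policy:", policy, "return:", rollout(policy))
```

The reward received at each step is the feedback signal, `greedy_policy` is the learned decision-making strategy, and `rollout` measures the cumulative reward that the strategy collects.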

Papers

Showing 5251-5275 of 15113 papers

Title | Status | Hype
Implementations that Matter in Cooperative Multi-Agent Reinforcement Learning | | 0
Implementing Online Reinforcement Learning with Temporal Neural Networks | | 0
Implementing Reinforcement Learning Algorithms in Retail Supply Chains with OpenAI Gym Toolkit | | 0
Implications of Human Irrationality for Reinforcement Learning | | 0
Implicitly Regularized RL with Implicit Q-Values | | 0
Implicit Neural-Representation Learning for Elastic Deformable-Object Manipulations | | 0
Implicit Offline Reinforcement Learning via Supervised Learning | | 0
Implicit Policy for Reinforcement Learning | | 0
Importance mixing: Improving sample reuse in evolutionary policy search methods | | 0
Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning | | 0
Importance of Environment Design in Reinforcement Learning: A Study of a Robotic Environment | | 0
Importance Sampling-Guided Meta-Training for Intelligent Agents in Highly Interactive Environments | | 0
Importance Weighted Evolution Strategies | | 0
Importance Weighted Policy Learning and Adaptation | | 0
Importance Weighted Transfer of Samples in Reinforcement Learning | | 0
Imposing Robust Structured Control Constraint on Reinforcement Learning of Linear Quadratic Regulator | | 0
Improper Reinforcement Learning with Gradient-based Policy Optimization | | 0
Improved Activity Forecasting for Generating Trajectories | | 0
Provably Improved Context-Based Offline Meta-RL with Attention and Contrastive Learning | | 0
Improved cooperation by balancing exploration and exploitation in intertemporal social dilemma tasks | | 0
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning | | 0
Improved Exploration through Latent Trajectory Optimization in Deep Deterministic Policy Gradient | | 0
Improved Learning in Evolution Strategies via Sparser Inter-Agent Network Topologies | | 0
Improved Learning of Robot Manipulation Tasks via Tactile Intrinsic Motivation | | 0
Improved Memories Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified