SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
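As a rough illustration of the loop described above, here is a minimal sketch using tabular Q-learning, one common RL method chosen for illustration. The five-state chain environment, the `step` function, and all hyperparameters are hypothetical and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (toy assumption)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward of one episode: sum of gamma**t * r_t."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward now plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q = q_learning()
# Greedy policy: the highest-valued action in each non-terminal state.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

In this toy setup the greedy policy should settle on moving right in every non-terminal state, since that is the shortest path to the only reward. `discounted_return` shows what "cumulative reward" means for a single episode when later rewards are discounted by `gamma`.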

Papers

Showing 5851–5875 of 15113 papers

Title | Status | Hype
Self-critical Sequence Training for Automatic Speech Recognition |  | 0
Self-Driving Car Racing: Application of Deep Reinforcement Learning |  | 0
Self-driving scale car trained by Deep reinforcement learning |  | 0
Self-Driving Telescopes: Autonomous Scheduling of Astronomical Observation Campaigns with Offline Reinforcement Learning |  | 0
Self-evolving Autoencoder Embedded Q-Network |  | 0
Self-Evolving Curriculum for LLM Reasoning |  | 0
Self-Imitation Advantage Learning |  | 0
Self-Imitation Learning by Planning |  | 0
Self-Imitation Learning from Demonstrations |  | 0
Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning |  | 0
Self-Inspection Method of Unmanned Aerial Vehicles in Power Plants Using Deep Q-Network Reinforcement Learning |  | 0
Self-Learned Formula Synthesis in Set Theory |  | 0
Self-Learning Tuning for Post-Silicon Validation |  | 0
Self-optimizing adaptive optics control with Reinforcement Learning for high-contrast imaging |  | 0
Self-organization in a distributed coordination game through heuristic rules |  | 0
Self-Organizing Maps as a Storage and Transfer Mechanism in Reinforcement Learning |  | 0
Self-Organizing Maps for Storage and Transfer of Knowledge in Reinforcement Learning |  | 0
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation |  | 0
Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games |  | 0
Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models |  | 0
Self-Supervised Continuous Control without Policy Gradient |  | 0
Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning |  | 0
Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning |  | 0
Relevance-Guided Modeling of Object Dynamics for Reinforcement Learning |  | 0
Self-Supervised Reinforcement Learning for Recommender Systems |  | 0
Page 235 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified