SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
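
To make the agent-environment loop concrete, here is a minimal sketch using tabular Q-learning, one common RL algorithm picked purely for illustration (the description above does not name a specific method). The five-state chain environment, the `step` and `q_learning` helpers, and all hyperparameters are hypothetical choices for this example: the agent starts in state 0, can move left or right, and receives a reward of 1 only when it reaches the last state.

```python
import random

N_STATES = 5        # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the goal
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=1000, seed=0):
    """Learn action values Q[state][action] from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Behave uniformly at random; Q-learning is off-policy, so it
            # can still estimate the values of the greedy policy.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus the discounted
            # value of the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            if done:
                break
            state = next_state
    return q


q_table = q_learning()
# Greedy policy: in each non-terminal state, pick the highest-valued action.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this sketch the "policy" is simply the action with the highest learned value in each state, and the discount factor `gamma` is what turns individual rewards into a long-term objective. Real environments and agents are far larger, but the interact, observe reward, update cycle has the same shape.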

Papers

Showing 6251-6275 of 15113 papers

Title | Status | Hype
Strategies for Using Proximal Policy Optimization in Mobile Puzzle Games | | 0
Strategising template-guided needle placement for MR-targeted prostate biopsy | | 0
Strategy and Benchmark for Converting Deep Q-Networks to Event-Driven Spiking Neural Networks | | 0
Stratified Experience Replay: Correcting Multiplicity Bias in Off-Policy Reinforcement Learning | | 0
Stratified Expert Cloning with Adaptive Selection for User Retention in Large-Scale Recommender Systems | | 0
Stratospheric Aerosol Injection as a Deep Reinforcement Learning Problem | | 0
Streaming Linear System Identification with Reverse Experience Replay | | 0
Streaming Traffic Flow Prediction Based on Continuous Reinforcement Learning | | 0
StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation | | 0
Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning | | 0
S-TRIGGER: Continual State Representation Learning via Self-Triggered Generative Replay | | 0
Striving for Simplicity in Off-Policy Deep Reinforcement Learning | | 0
Strongly-polynomial time and validation analysis of policy gradient methods | | 0
Structural Credit Assignment in Neural Networks using Reinforcement Learning | | 0
Structural Credit Assignment with Coordinated Exploration | | 0
Structural Return Maximization for Reinforcement Learning | | 0
Structural Similarity for Improved Transfer in Reinforcement Learning | | 0
Structure-aware reinforcement learning for node-overload protection in mobile edge computing | | 0
Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning | | 0
Structured Dialogue Policy with Graph Neural Networks | | 0
Structured Graph Network for Constrained Robot Crowd Navigation with Low Fidelity Simulation | | 0
Structured Reinforcement Learning for Delay-Optimal Data Transmission in Dense mmWave Networks | | 0
Structured World Belief for Reinforcement Learning in POMDP | | 0
Structure-Enhanced Deep Reinforcement Learning for Optimal Transmission Scheduling | | 0
Structure in Deep Reinforcement Learning: A Survey and Open Problems | | 0
Page 251 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified