SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, received as rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
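As a rough illustration of the interaction loop described above, the sketch below trains an agent on a toy problem. Everything in it is an assumption made for illustration and not something this page specifies: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, the hyperparameters, and the choice of tabular Q-learning as the learning rule (one of many RL algorithms).

```python
import random

# Hypothetical toy environment: a 5-cell corridor. The agent starts in cell 0
# and receives a reward of 1.0 only on reaching cell 4, which ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each non-terminal cell."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]
```

Under these assumptions, `greedy_policy(train())` should settle on moving right in every non-terminal cell, since that is the only route to the reward. The discount factor `gamma` is what makes the agent value a distant reward less than a nearby one, which is one common way to formalize "long-term reward".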

Papers

Showing 6201 to 6225 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Statistical CSI-based Beamforming for RIS-Aided Multiuser MISO Systems using Deep Reinforcement Learning | | 0 |
| Statistical Guarantees for Lifelong Reinforcement Learning using PAC-Bayes Theory | | 0 |
| Statistical Inference After Adaptive Sampling for Longitudinal Data | | 0 |
| Statistically Model Checking PCTL Specifications on Markov Decision Processes via Reinforcement Learning | | 0 |
| Statistics and Samples in Distributional Reinforcement Learning | | 0 |
| Learning Skills to Navigate without a Master: A Sequential Multi-Policy Reinforcement Learning Algorithm | | 0 |
| Steady State Analysis of Episodic Reinforcement Learning | | 0 |
| Steady-State Error Compensation for Reinforcement Learning with Quadratic Rewards | | 0 |
| Stealing Deep Reinforcement Learning Models for Fun and Profit | | 0 |
| Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning | | 0 |
| Steering LLM Reasoning Through Bias-Only Adaptation | | 0 |
| STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning | | 0 |
| Steering Your Diffusion Policy with Latent Space Reinforcement Learning | | 0 |
| Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning | | 0 |
| Stein Variational Policy Gradient | | 0 |
| Stepping Out of the Shadows: Reinforcement Learning in Shadow Mode | | 0 |
| Step-wise Adaptive Integration of Supervised Fine-tuning and Reinforcement Learning for Task-Specific LLMs | | 0 |
| Stigmergic Independent Reinforcement Learning for Multi-Agent Collaboration | | 0 |
| Stochastically Dominant Distributional Reinforcement Learning | | 0 |
| Stochastic Approximation of Gaussian Free Energy for Risk-Sensitive Reinforcement Learning | | 0 |
| Stochastic Approximation with Markov Noise: Analysis and applications in reinforcement learning | | 0 |
| Stochastic Constraint Programming as Reinforcement Learning | | 0 |
| Stochastic convex optimization for provably efficient apprenticeship learning | | 0 |
| Stochastic evolution in populations of ideas | | 0 |
| Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning | | 0 |
Page 249 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |