SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
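The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one common RL algorithm chosen here only for illustration, on a hypothetical five-state corridor. The environment, the `step` function, the reward scheme, and the hyperparameter values are all assumptions made for this example and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a corridor of states 0..4, where 4 is the goal.
N_STATES = 5
ACTIONS = (-1, +1)                      # index 0 moves left, index 1 moves right
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2   # assumed discount, step size, exploration rate


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Run tabular Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy choice; ties are broken at random so the
            # untrained agent does not get stuck on one action.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Learn from feedback: move Q(s, a) toward reward + discounted future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action for each non-terminal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
```

Calling `greedy_policy(train())` returns one action per non-terminal state. In this toy setting the only reward sits at the right end of the corridor, so the learned policy should move right everywhere.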

Papers

Showing 8176–8200 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Neurosymbolic Reinforcement Learning and Planning: A Survey | | 0 |
| Neuro-Symbolic Reinforcement Learning with First-Order Logic | | 0 |
| Neuro-Symbolic World Models for Adapting to Open World Novelty | | 0 |
| NeuSaver: Neural Adaptive Power Consumption Optimization for Mobile Video Streaming | | 0 |
| Never too Prim to Swim: An LLM-Enhanced RL-based Adaptive S-Surface Controller for AUVs under Extreme Sea Conditions | | 0 |
| New Auction Algorithms for Path Planning, Network Transport, and Reinforcement Learning | | 0 |
| New Challenges in Reinforcement Learning: A Survey of Security and Privacy | | 0 |
| New Reinforcement Learning Using a Chaotic Neural Network for Emergence of "Thinking" - "Exploration" Grows into "Thinking" through Learning - | | 0 |
| News-based trading strategies | | 0 |
| Next-Future: Sample-Efficient Policy Learning for Robotic-Arm Tasks | | 0 |
| N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs | | 0 |
| NIL: No-data Imitation Learning by Leveraging Pre-trained Video Diffusion Models | | 0 |
| No DBA? No regret! Multi-armed bandits for index tuning of analytical and HTAP workloads with provable guarantees | | 0 |
| Node Injection Attacks on Graphs via Reinforcement Learning | | 0 |
| Noise as a Double-Edged Sword: Reinforcement Learning Exploits Randomized Defenses in Neural Networks | | 0 |
| Noise-based reward-modulated learning | | 0 |
| Noise Distribution Decomposition based Multi-Agent Distributional Reinforcement Learning | | 0 |
| Some approaches used to overcome overestimation in Deep Reinforcement Learning algorithms | | 0 |
| Noise Pollution in Hospital Readmission Prediction: Long Document Classification with Reinforcement Learning | | 0 |
| Noise-Robust End-to-End Quantum Control using Deep Autoregressive Policy Networks | | 0 |
| Noisy Agents: Self-supervised Exploration by Predicting Auditory Events | | 0 |
| Noisy Spiking Actor Network for Exploration | | 0 |
| No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL | | 0 |
| Non-asymptotic Analysis of Biased Stochastic Approximation Scheme | | 0 |
| Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling | | 0 |
Page 328 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |