SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent learns from interaction: each action it takes yields feedback in the form of a reward or a penalty. The goal is to find a policy, that is, a decision-making strategy, that maximizes long-term reward.
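The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch and is not taken from any paper listed on this page: the five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are assumptions chosen for brevity.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the terminal goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward is 1 only on reaching the goal, 0 otherwise.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2,
          max_steps=1000, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with probability epsilon (or when the estimates tie),
            # otherwise exploit the current value estimates.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best
            # estimated value of the next state (zero after termination).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` should yield a policy that moves right in every non-terminal state, since that is the only way to reach the rewarded goal. The discount factor `gamma` is what makes "long-term reward" concrete here: states closer to the goal end up with higher estimated values.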

Papers

Showing 8226-8250 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| No-regret Exploration in Shuffle Private Reinforcement Learning | | 0 |
| No-Regret Reinforcement Learning in Smooth MDPs | | 0 |
| No-Regret Reinforcement Learning with Heavy-Tailed Rewards | | 0 |
| Value Function Approximations via Kernel Embeddings for No-Regret Reinforcement Learning | | 0 |
| Normality-Guided Distributional Reinforcement Learning for Continuous Control | | 0 |
| NoRML: No-Reward Meta Learning | | 0 |
| Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning | | 0 |
| NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds | | 0 |
| Novel Reinforcement Learning Algorithm for Suppressing Synchronization in Closed Loop Deep Brain Stimulators | | 0 |
| Novel Sensor Scheduling Scheme for Intruder Tracking in Energy Efficient Sensor Networks | | 0 |
| NovGrid: A Flexible Grid World for Evaluating Agent Response to Novelty | | 0 |
| Now I Remember! Episodic Memory For Reinforcement Learning | | 0 |
| NROWAN-DQN: A Stable Noisy Network with Noise Reduction and Online Weight Adjustment for Exploration | | 0 |
| Nuclear Microreactor Control with Deep Reinforcement Learning | | 0 |
| NVCell: Standard Cell Layout in Advanced Technology Nodes with Reinforcement Learning | | 0 |
| Object-Category Aware Reinforcement Learning | | 0 |
| Object Exchangeability in Reinforcement Learning: Extended Abstract | | 0 |
| Objective-aware Traffic Simulation via Inverse Reinforcement Learning | | 0 |
| Object-oriented Neural Programming (OONP) for Document Understanding | | 0 |
| Object-sensitive Deep Reinforcement Learning | | 0 |
| Observational Learning by Reinforcement Learning | | 0 |
| Observational Overfitting in Reinforcement Learning | | 0 |
| Bounded Robustness in Reinforcement Learning via Lexicographic Objectives | | 0 |
| Observe and Look Further: Achieving Consistent Performance on Atari | | 0 |
| Observed Adversaries in Deep Reinforcement Learning | | 0 |
Page 330 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |