SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find a policy, that is, a decision-making strategy mapping situations to actions, that maximizes the expected long-term reward.
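As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning on a toy five-state corridor. Everything in it is an assumption made for illustration and does not come from this page: the corridor environment, the function names `step`, `train`, and `greedy_policy`, and the hyperparameter values. Q-learning is just one of many RL algorithms, chosen here because it fits in a few lines.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)    # action index 0 moves left, index 1 moves right

def step(state, action):
    """Toy environment dynamics: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the goal
    return next_state, reward, done

def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both values are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def greedy_policy(q):
    """Best action for each non-terminal state (ties resolve to 'right')."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]

q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the reward arrives only at the goal, so the agent has to propagate value backwards through the discount factor `gamma`; the learned greedy policy is the "decision-making strategy" the description refers to.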

Papers

Showing 10301-10325 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Now I Remember! Episodic Memory For Reinforcement Learning | | 0 |
| NROWAN-DQN: A Stable Noisy Network with Noise Reduction and Online Weight Adjustment for Exploration | | 0 |
| Nuclear Microreactor Control with Deep Reinforcement Learning | | 0 |
| NVCell: Standard Cell Layout in Advanced Technology Nodes with Reinforcement Learning | | 0 |
| Object-Category Aware Reinforcement Learning | | 0 |
| Object Exchangeability in Reinforcement Learning: Extended Abstract | | 0 |
| Objective-aware Traffic Simulation via Inverse Reinforcement Learning | | 0 |
| Object-oriented Neural Programming (OONP) for Document Understanding | | 0 |
| Object-sensitive Deep Reinforcement Learning | | 0 |
| Observational Learning by Reinforcement Learning | | 0 |
| Observational Overfitting in Reinforcement Learning | | 0 |
| Bounded Robustness in Reinforcement Learning via Lexicographic Objectives | | 0 |
| Observe and Look Further: Achieving Consistent Performance on Atari | | 0 |
| Observed Adversaries in Deep Reinforcement Learning | | 0 |
| Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning | | 0 |
| Obtain Employee Turnover Rate and Optimal Reduction Strategy Based On Neural Network and Reinforcement Learning | | 0 |
| OCALM: Object-Centric Assessment with Language Models | | 0 |
| Occam's razor is insufficient to infer the preferences of irrational agents | | 0 |
| Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search | | 0 |
| ODE-based Recurrent Model-free Reinforcement Learning for POMDPs | | 0 |
| ODGR: Online Dynamic Goal Recognition | | 0 |
| Off-Beat Multi-Agent Reinforcement Learning | | 0 |
| Off-dynamics Conditional Diffusion Planners | | 0 |
| Off-Dynamics Inverse Reinforcement Learning from Hetero-Domain | | 0 |
| Offline and Distributional Reinforcement Learning for Radio Resource Management | | 0 |
Page 413 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |