SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
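The interaction loop described above can be sketched in a few lines. The example below is a minimal illustration, not taken from any paper listed here: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameters are all hypothetical, and tabular Q-learning is just one of many algorithms that fit the description.

```python
import random

# Hypothetical toy environment: a corridor of states 0..4, goal at state 4.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GOAL = N_STATES - 1


def step(state, action):
    """Apply an action; reward is 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, max_steps=1000, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]


q_table = train()
policy = greedy_policy(q_table)
```

Here the learned `policy` is the decision-making strategy, and the discount factor `gamma` is one common way to define "long-term" reward. Because Q-learning is off-policy, the agent can explore at random while still estimating the values of the greedy policy.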

Papers

Showing 8251–8275 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning | | 0 |
| Obtain Employee Turnover Rate and Optimal Reduction Strategy Based On Neural Network and Reinforcement Learning | | 0 |
| OCALM: Object-Centric Assessment with Language Models | | 0 |
| Occam's razor is insufficient to infer the preferences of irrational agents | | 0 |
| Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search | | 0 |
| ODE-based Recurrent Model-free Reinforcement Learning for POMDPs | | 0 |
| ODGR: Online Dynamic Goal Recognition | | 0 |
| Off-Beat Multi-Agent Reinforcement Learning | | 0 |
| Off-dynamics Conditional Diffusion Planners | | 0 |
| Off-Dynamics Inverse Reinforcement Learning from Hetero-Domain | | 0 |
| Offline and Distributional Reinforcement Learning for Radio Resource Management | | 0 |
| Offline and Distributional Reinforcement Learning for Wireless Communications | | 0 |
| Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration | | 0 |
| Offline Decentralized Multi-Agent Reinforcement Learning | | 0 |
| Offline Deep Reinforcement Learning for Dynamic Pricing of Consumer Credit | | 0 |
| Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data | | 0 |
| Offline Evaluation for Reinforcement Learning-based Recommendation: A Critical Issue and Some Alternatives | | 0 |
| Offline Fictitious Self-Play for Competitive Games | | 0 |
| Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies | | 0 |
| Offline Hierarchical Reinforcement Learning via Inverse Optimization | | 0 |
| Offline Imitation Learning from Multiple Baselines with Applications to Compiler Optimization | | 0 |
| Offline Imitation Learning Through Graph Search and Retrieval | | 0 |
| Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare | | 0 |
| Offline Inverse Reinforcement Learning | | 0 |
| Offline Learning in Markov Games with General Function Approximation | | 0 |
Page 331 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |