SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
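The interaction loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method of any paper listed on this page: the two-armed bandit environment `pull`, its payout probabilities in `PAYOUT_PROB`, and the epsilon-greedy agent in `run_agent` are all hypothetical choices made for the example.

```python
import random

# Hypothetical environment: a two-armed bandit. Arm 1 pays out more often.
PAYOUT_PROB = [0.2, 0.8]  # assumed values, chosen only for illustration


def pull(arm, rng):
    """Return a reward of 1.0 with the arm's payout probability, else 0.0."""
    return 1.0 if rng.random() < PAYOUT_PROB[arm] else 0.0


def run_agent(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: estimate each arm's value from reward feedback."""
    rng = random.Random(seed)
    values = [0.0, 0.0]  # running estimate of the expected reward per arm
    counts = [0, 0]      # how many times each arm has been tried
    total_reward = 0.0
    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(2)
        else:
            arm = max(range(2), key=lambda a: values[a])
        reward = pull(arm, rng)
        counts[arm] += 1
        # Incremental mean update of the value estimate for the chosen arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return values, counts, total_reward


values, counts, total_reward = run_agent()
# The learned policy here is simply "pick the arm with the highest estimate".
greedy_policy = max(range(2), key=lambda a: values[a])
print("estimated values:", values)
print("greedy action:", greedy_policy)
print("cumulative reward:", total_reward)
```

The agent never sees `PAYOUT_PROB` directly. It only observes rewards, and the occasional random action (controlled by `epsilon`) is what lets it discover that the second arm is better. Full RL problems add state and delayed rewards on top of this loop.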

Papers

Showing 10301-10350 of 15113 papers

| Title | Status | Hype |
| Now I Remember! Episodic Memory For Reinforcement Learning | | 0 |
| NROWAN-DQN: A Stable Noisy Network with Noise Reduction and Online Weight Adjustment for Exploration | | 0 |
| Nuclear Microreactor Control with Deep Reinforcement Learning | | 0 |
| NVCell: Standard Cell Layout in Advanced Technology Nodes with Reinforcement Learning | | 0 |
| Object-Category Aware Reinforcement Learning | | 0 |
| Object Exchangeability in Reinforcement Learning: Extended Abstract | | 0 |
| Objective-aware Traffic Simulation via Inverse Reinforcement Learning | | 0 |
| Object-oriented Neural Programming (OONP) for Document Understanding | | 0 |
| Object-sensitive Deep Reinforcement Learning | | 0 |
| Observational Learning by Reinforcement Learning | | 0 |
| Observational Overfitting in Reinforcement Learning | | 0 |
| Bounded Robustness in Reinforcement Learning via Lexicographic Objectives | | 0 |
| Observe and Look Further: Achieving Consistent Performance on Atari | | 0 |
| Observed Adversaries in Deep Reinforcement Learning | | 0 |
| Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning | | 0 |
| Obtain Employee Turnover Rate and Optimal Reduction Strategy Based On Neural Network and Reinforcement Learning | | 0 |
| OCALM: Object-Centric Assessment with Language Models | | 0 |
| Occam's razor is insufficient to infer the preferences of irrational agents | | 0 |
| Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search | | 0 |
| ODE-based Recurrent Model-free Reinforcement Learning for POMDPs | | 0 |
| ODGR: Online Dynamic Goal Recognition | | 0 |
| Off-Beat Multi-Agent Reinforcement Learning | | 0 |
| Off-dynamics Conditional Diffusion Planners | | 0 |
| Off-Dynamics Inverse Reinforcement Learning from Hetero-Domain | | 0 |
| Offline and Distributional Reinforcement Learning for Radio Resource Management | | 0 |
| Offline and Distributional Reinforcement Learning for Wireless Communications | | 0 |
| Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration | | 0 |
| Offline Decentralized Multi-Agent Reinforcement Learning | | 0 |
| Offline Deep Reinforcement Learning for Dynamic Pricing of Consumer Credit | | 0 |
| Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data | | 0 |
| Offline Evaluation for Reinforcement Learning-based Recommendation: A Critical Issue and Some Alternatives | | 0 |
| Offline Fictitious Self-Play for Competitive Games | | 0 |
| Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies | | 0 |
| Offline Hierarchical Reinforcement Learning via Inverse Optimization | | 0 |
| Offline Imitation Learning from Multiple Baselines with Applications to Compiler Optimization | | 0 |
| Offline Imitation Learning Through Graph Search and Retrieval | | 0 |
| Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare | | 0 |
| Offline Inverse Reinforcement Learning | | 0 |
| Offline Learning in Markov Games with General Function Approximation | | 0 |
| Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning | | 0 |
| Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation | | 0 |
| Offline Model-Based Reinforcement Learning with Anti-Exploration | | 0 |
| Offline Multi-Agent Reinforcement Learning with Coupled Value Factorization | | 0 |
| Offline Multitask Representation Learning for Reinforcement Learning | | 0 |
| Offline Multi-task Transfer RL with Representational Penalization | | 0 |
| Offline-Online Reinforcement Learning: Extending Batch and Online RL | | 0 |
| Offline-Online Reinforcement Learning for Energy Pricing in Office Demand Response: Lowering Energy and Data Costs | | 0 |
| Offline Policy Evaluation and Optimization under Confounding | | 0 |
| Offline Policy Optimization in RL with Variance Regularizaton | | 0 |
| Offline Policy Optimization with Variance Regularization | | 0 |
Page 207 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |