SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives for its actions, in the form of rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
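To make the interaction loop concrete, here is a minimal sketch using tabular Q-learning, one of many RL algorithms and not one this page singles out. The five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical, chosen only for illustration.

```python
import random

N_STATES = 5       # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 on reaching the rightmost state, else 0.0."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values from interaction with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties at random so early episodes still move.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

The discount factor `gamma` is what turns "long-term reward" into a single number: a reward received k steps in the future is weighted by `gamma**k`, so the learned values favor reaching the goal sooner.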

Papers

Showing 6601-6650 of 15113 papers

Title | Status | Hype
Do Artificial Reinforcement-Learning Agents Matter Morally? | | 0
Do as I can, not as I get | | 0
Do Autonomous Agents Benefit from Hearing? | | 0
DOB-Net: Actively Rejecting Unknown Excessive Time-Varying Disturbances | | 0
Document-editing Assistants and Model-based Reinforcement Learning as a Path to Conversational AI | | 0
Do Deep Reinforcement Learning Algorithms really Learn to Navigate? | | 0
Does Explicit Prediction Matter in Deep Reinforcement Learning-Based Energy Management? | | 0
How Does an Approximate Model Help in Reinforcement Learning? | | 0
Does Sparsity Help in Learning Misspecified Linear Bandits? | | 0
Domain Adaptation for Deep Reinforcement Learning in Visually Distinct Games | | 0
Domain Adaptation for Offline Reinforcement Learning with Limited Samples | | 0
Domain Adaptation for Reinforcement Learning on the Atari | | 0
Domain Adaptation of Reinforcement Learning Agents based on Network Service Proximity | | 0
DOMAIN ADAPTATION VIA DISTRIBUTION AND REPRESENTATION MATCHING: A CASE STUDY ON TRAINING DATA SELECTION VIA REINFORCEMENT LEARNING | | 0
Domain Adapting Deep Reinforcement Learning for Real-world Speech Emotion Recognition | | 0
Domain Adaptive Fake News Detection via Reinforcement Learning | | 0
Domain Adversarial Reinforcement Learning | | 0
Domain Adversarial Reinforcement Learning for Partial Domain Adaptation | | 0
Domain Generalization for Robust Model-Based Offline Reinforcement Learning | | 0
Domain-Independent Optimistic Initialization for Reinforcement Learning | | 0
Domain Knowledge-Based Automated Analog Circuit Design with Deep Reinforcement Learning | | 0
Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning | | 0
DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning | | 0
Domain Randomization for Robust, Affordable and Effective Closed-loop Control of Soft Robots | | 0
Domain Randomization via Entropy Maximization | | 0
Dominion: A New Frontier for AI Research | | 0
Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition | | 0
Do No Harm: A Counterfactual Approach to Safe Reinforcement Learning | | 0
Don't do it: Safer Reinforcement Learning With Rule-based Guidance | | 0
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL | | 0
Don't Forget Your Teacher: A Corrective Reinforcement Learning Framework | | 0
Don't Get Yourself into Trouble! Risk-aware Decision-Making for Autonomous Vehicles | | 0
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning | | 0
Don't Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation | | 0
DOOM: A Novel Adversarial-DRL-Based Op-Code Level Metamorphic Malware Obfuscator for the Enhancement of IDS | | 0
DOP: Deep Optimistic Planning with Approximate Value Function Evaluation | | 0
Do recent advancements in model-based deep reinforcement learning really improve data efficiency? | | 0
Importance of using appropriate baselines for evaluation of data-efficiency in deep reinforcement learning for Atari | | 0
Dot-to-Dot: Explainable Hierarchical Reinforcement Learning for Robotic Manipulation | | 0
Double A3C: Deep Reinforcement Learning on OpenAI Gym Games | | 0
Double Deep Q Networks for Sensor Management in Space Situational Awareness | | 0
Double Meta-Learning for Data Efficient Policy Optimization in Non-Stationary Environments | | 0
Double Q(σ) and Q(σ, λ): Unifying Reinforcement Learning Control Algorithms | | 0
Double Q-learning | | 0
Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation | | 0
Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning | | 0
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning | | 0
DPO: A Differential and Pointwise Control Approach to Reinforcement Learning | | 0
DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a Real Steam Turbine System | | 0
DQNAS: Neural Architecture Search using Reinforcement Learning | | 0
Page 133 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified