SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes expected long-term reward.
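
The loop described above (act, observe a reward, update, repeat) can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed here: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made for the example.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward 1.0 only when the goal is reached, 0.0 otherwise.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # safety cap on episode length
            # Explore with probability epsilon (or when the estimates tie),
            # otherwise exploit the current value estimates.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward the
            # reward plus the discounted best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each non-terminal state."""
    return [1 if q[s][1] > q[s][0] else 0 for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Here the "policy" is simply the action with the larger estimated value in each state, and the discount factor `gamma` is what makes the agent weigh long-term reward rather than only the immediate one.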

Papers

Showing 9726–9750 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Learning from Demonstrations with Energy based Generative Adversarial Imitation Learning | | 0 |
| Incremental Policy Gradients for Online Reinforcement Learning Control | | 0 |
| Bounded Myopic Adversaries for Deep Reinforcement Learning Agents | | 0 |
| Learning Safe Policies with Cost-sensitive Advantage Estimation | | 0 |
| An Examination of Preference-based Reinforcement Learning for Treatment Recommendation | | 0 |
| A Robust Fuel Optimization Strategy For Hybrid Electric Vehicles: A Deep Reinforcement Learning Based Continuous Time Design Approach | | 0 |
| BRAC+: Going Deeper with Behavior Regularized Offline Reinforcement Learning | | 0 |
| Deep Reinforcement Learning With Adaptive Combined Critics | | 0 |
| Hindsight Curriculum Generation Based Multi-Goal Experience Replay | | 0 |
| Distributional Reinforcement Learning for Risk-Sensitive Policies | | 0 |
| Learning Efficient Planning-based Rewards for Imitation Learning | | 0 |
| Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning | | 0 |
| Entropic Risk-Sensitive Reinforcement Learning: A Meta Regret Framework with Function Approximation | | 0 |
| Discrete Predictive Representation for Long-horizon Planning | | 0 |
| Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms | | 0 |
| A REINFORCEMENT LEARNING FRAMEWORK FOR TIME DEPENDENT CAUSAL EFFECTS EVALUATION IN A/B TESTING | | 0 |
| Learning a Transferable Scheduling Policy for Various Vehicle Routing Problems based on Graph-centric Representation Learning | | 0 |
| Learning to communicate through imagination with model-based deep multi-agent reinforcement learning | | 0 |
| Average Reward Reinforcement Learning with Monotonic Policy Improvement | | 0 |
| Error Controlled Actor-Critic Method to Reinforcement Learning | | 0 |
| Learning to Dynamically Select Between Reward Shaping Signals | | 0 |
| Daylight: Assessing Generalization Skills of Deep Reinforcement Learning Agents | | 0 |
| Learning to Explore with Pleasure | | 0 |
| Learning Active Learning in the Batch-Mode Setup with Ensembles of Active Learning Agents | | 0 |
| Learning to Observe with Reinforcement Learning | | 0 |
Page 390 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |