SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
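As a rough illustration of the agent-environment loop described above, the following is a minimal sketch using tabular Q-learning, which is only one of many RL algorithms and is chosen here for brevity. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are hypothetical and do not come from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption for illustration).

    Returns (next_state, reward, done). Reaching the last state pays 1.0.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit: pick a highest-valued action, ties broken at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the reward arrives only at the end of the chain, so the discount factor `gamma` is what lets earlier states learn to prefer moving toward it; this is the "long-term reward" aspect of the definition in miniature.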

Papers

Showing 12851–12900 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Introspection Learning | | 0 |
| Distributed Edge Caching via Reinforcement Learning in Fog Radio Access Networks | | 0 |
| Diagnosing Bottlenecks in Deep Q-learning Algorithms | Code | 0 |
| Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies | | 0 |
| Learning When Not to Answer: A Ternary Reward Structure for Reinforcement Learning based Question Answering | | 0 |
| Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings | | 0 |
| Can Meta-Interpretive Learning outperform Deep Reinforcement Learning of Evaluable Game strategies? | | 0 |
| Flappy Hummingbird: An Open Source Dynamic Simulation of Flapping Wing Robots and Animals | Code | 0 |
| Adversarial Reinforcement Learning under Partial Observability in Autonomous Computer Network Defence | | 0 |
| Learning Extreme Hummingbird Maneuvers on Flapping Wing Robots | | 0 |
| S-TRIGGER: Continual State Representation Learning via Self-Triggered Generative Replay | | 0 |
| Long-Range Indoor Navigation with PRM-RL | | 0 |
| Aggregating E-commerce Search Results from Heterogeneous Sources via Hierarchical Reinforcement Learning | | 0 |
| Distributionally Robust Reinforcement Learning | | 0 |
| A General Framework for Structured Learning of Mechanical Systems | Code | 0 |
| Generative Memory for Lifelong Reinforcement Learning | | 0 |
| Learning Deterministic Policy with Target for Power Control in Wireless Networks | | 0 |
| Statistics and Samples in Distributional Reinforcement Learning | | 0 |
| From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following | | 0 |
| Curiosity-Driven Experience Prioritization via Density Estimation | | 0 |
| Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs | Code | 0 |
| Deep Reinforcement Learning using Genetic Algorithm for Parameter Optimization | Code | 0 |
| DOM-Q-NET: Grounded RL on Structured Language | Code | 0 |
| A novel repetition normalized adversarial reward for headline generation | | 0 |
| Emergent Coordination Through Competition | | 0 |
| Hyperbolic Discounting and Learning over Multiple Horizons | Code | 0 |
| Investigating Generalisation in Continuous Deep Reinforcement Learning | | 0 |
| Parenting: Safe Reinforcement Learning from Human Input | | 0 |
| Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning | | 0 |
| A new Potential-Based Reward Shaping for Reinforcement Learning Agent | | 0 |
| Heuristics, Answer Set Programming and Markov Decision Process for Solving a Set of Spatial Puzzles | Code | 0 |
| Leveraging Communication Topologies Between Learning Agents in Deep Reinforcement Learning | | 0 |
| Asynchronous Coagent Networks | | 0 |
| Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic | | 0 |
| Robust Reinforcement Learning in POMDPs with Incomplete and Noisy Observations | | 0 |
| Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning | Code | 0 |
| Unsupervised Visuomotor Control through Distributional Planning Networks | Code | 0 |
| Verifiably Safe Off-Model Reinforcement Learning | Code | 1 |
| CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity | Code | 1 |
| Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning | | 0 |
| Reinforcement Learning for UAV Attitude Control | | 0 |
| Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems | | 0 |
| Preferences Implicit in the State of the World | Code | 0 |
| ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning | | 0 |
| Deep Reinforcement Learning from Policy-Dependent Human Feedback | | 0 |
| Generalization through Simulation: Integrating Simulated and Real Data into Deep Reinforcement Learning for Vision-Based Autonomous Flight | Code | 0 |
| Latent Space Reinforcement Learning for Steering Angle Prediction | | 0 |
| Performance Dynamics and Termination Errors in Reinforcement Learning: A Unifying Perspective | | 0 |
| WiseMove: A Framework for Safe Deep Reinforcement Learning for Autonomous Driving | | 0 |
| Stochastic Reinforcement Learning | | 0 |
Page 258 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |