SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
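The interaction loop described above can be illustrated with a minimal sketch of tabular Q-learning. Everything in it is an assumption made for illustration and is not taken from any paper listed on this page: the five-state chain environment, the `step` function, the hyperparameters, and the uniformly random behaviour policy (Q-learning is off-policy, so it can still learn a greedy policy from random exploration).

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the goal state is reached."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Learn action values from reward feedback gathered by random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # uniformly random behaviour policy
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            if done:
                break
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the greedy policy derived from the learned values should move right toward the goal from every non-terminal state, which is the long-term-reward-maximizing behaviour for this chain.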

Papers

Showing 12951-13000 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Predicting Research Trends From Arxiv | Code | 0 |
| Concurrent Meta Reinforcement Learning | Code | 0 |
| A Hitchhiker's Guide to Statistical Comparisons of Reinforcement Learning Algorithms | Code | 0 |
| Continual Learning Using World Models for Pseudo-Rehearsal | | 0 |
| Safety-Guided Deep Reinforcement Learning via Online Gaussian Process Estimation | | 0 |
| Synthesizing Chemical Plant Operation Procedures using Knowledge, Dynamic Simulation and Deep Reinforcement Learning | | 0 |
| Minigo: A Case Study in Reproducing Reinforcement Learning Research | | 0 |
| Training in Task Space to Speed Up and Guide Reinforcement Learning | | 0 |
| simple_rl: Reproducible Reinforcement Learning in Python | Code | 0 |
| Towards Understanding Chinese Checkers with Heuristics, Monte Carlo Tree Search, and Deep Reinforcement Learning | | 0 |
| Online Data Poisoning Attack | | 0 |
| Using Natural Language for Reward Shaping in Reinforcement Learning | Code | 0 |
| Viewpoint Optimization for Autonomous Strawberry Harvesting with Deep Reinforcement Learning | Code | 0 |
| Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future | | 0 |
| Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space | Code | 0 |
| Microscopic Traffic Simulation by Cooperative Multi-agent Deep Reinforcement Learning | | 0 |
| NoRML: No-Reward Meta Learning | | 0 |
| Budgeted Reinforcement Learning in Continuous State Space | Code | 0 |
| Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex Environments | Code | 0 |
| Hacking Google reCAPTCHA v3 using Reinforcement Learning | | 0 |
| A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning | | 0 |
| Discovering Options for Exploration by Minimizing Cover Time | | 0 |
| Efficient Reinforcement Learning for StarCraft by Abstract Forward Models and Transfer Learning | Code | 0 |
| Automating Predictive Modeling Process using Reinforcement Learning | | 0 |
| OmniDRL: Robust Pedestrian Detection using Deep Reinforcement Learning on Omnidirectional Cameras | | 0 |
| Straight to the point: reinforcement learning for user guidance in ultrasound | | 0 |
| Model-Based Reinforcement Learning for Atari | Code | 0 |
| TrojDRL: Trojan Attacks on Deep Reinforcement Learning Agents | Code | 0 |
| Learning To Follow Directions in Street View | Code | 0 |
| Reinforcement Learning based Curriculum Optimization for Neural Machine Translation | | 0 |
| Unsupervised Attention Mechanism across Neural Network Layers | Code | 0 |
| Neural Packet Classification | | 0 |
| Unifying Ensemble Methods for Q-learning via Social Choice Theory | | 0 |
| Deep Reinforcement Learning for Adaptive Caching in Hierarchical Content Delivery Networks | | 0 |
| Distributed Edge Caching via Reinforcement Learning in Fog Radio Access Networks | | 0 |
| Introspection Learning | | 0 |
| Diagnosing Bottlenecks in Deep Q-learning Algorithms | Code | 0 |
| Can Meta-Interpretive Learning outperform Deep Reinforcement Learning of Evaluable Game strategies? | | 0 |
| Learning When Not to Answer: A Ternary Reward Structure for Reinforcement Learning based Question Answering | | 0 |
| Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies | | 0 |
| Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings | | 0 |
| S-TRIGGER: Continual State Representation Learning via Self-Triggered Generative Replay | | 0 |
| Long-Range Indoor Navigation with PRM-RL | | 0 |
| Learning Extreme Hummingbird Maneuvers on Flapping Wing Robots | | 0 |
| Adversarial Reinforcement Learning under Partial Observability in Autonomous Computer Network Defence | | 0 |
| Flappy Hummingbird: An Open Source Dynamic Simulation of Flapping Wing Robots and Animals | Code | 0 |
| Aggregating E-commerce Search Results from Heterogeneous Sources via Hierarchical Reinforcement Learning | | 0 |
| Distributionally Robust Reinforcement Learning | | 0 |
| A General Framework for Structured Learning of Mechanical Systems | Code | 0 |
| Generative Memory for Lifelong Reinforcement Learning | | 0 |
Page 260 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |