SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
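The interaction loop described above can be sketched in a few lines. This is a minimal illustration, not taken from any paper listed here: the five-state `ChainEnv` environment, the `train` and `greedy_policy` helpers, and all hyperparameter values are hypothetical choices made for the example. Tabular Q-learning stands in as one common way to learn a policy from reward feedback.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1 on a line.

    The agent starts at state 0 and receives reward 1.0 only when it
    reaches the right-most state, which ends the episode.
    """

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left (clamped at state 0).
        move = 1 if action == 1 else -1
        self.state = max(0, self.state + move)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    env = ChainEnv()
    q = [[0.0, 0.0] for _ in range(env.n_states)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties resolve to 'right')."""
    return [0 if left > right else 1 for left, right in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The learned policy here is simply the greedy action per state. The discount factor `gamma` is what makes the agent value the delayed reward at the end of the chain, which is the "long-term reward" the description refers to.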

Papers

Showing 13451–13500 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Dual-Agent Deep Reinforcement Learning for Deformable Face Tracking | | 0 |
| Deep Reinforcement Learning with Iterative Shift for Visual Tracking | | 0 |
| Part-Activated Deep Reinforcement Learning for Action Prediction | | 0 |
| Snap Angle Prediction for 360° Panoramas | | 0 |
| A Contextual-bandit-based Approach for Informed Decision-making in Clinical Trials | | 0 |
| Ensemble Sequence Level Training for Multimodal MT: OSU-Baidu WMT18 Multimodal Machine Translation System Report | | 0 |
| APES: a Python toolbox for simulating reinforcement learning environments | Code | 0 |
| Directed Exploration in PAC Model-Free Reinforcement Learning | | 0 |
| Multi-Hop Knowledge Graph Reasoning with Reward Shaping | Code | 1 |
| Application of Self-Play Reinforcement Learning to a Four-Player Game of Imperfect Information | Code | 0 |
| ExIt-OOS: Towards Learning from Planning in Imperfect Information Games | Code | 0 |
| A Reinforcement Learning-driven Translation Model for Search-Oriented Conversational Systems | | 0 |
| APRIL: Interactively Learning to Summarise by Combining Active Preference Learning and Reinforcement Learning | Code | 0 |
| Learning a Policy for Opportunistic Active Learning | | 0 |
| Decoupling Strategy and Generation in Negotiation Dialogues | Code | 1 |
| Adversarial Deep Reinforcement Learning in Portfolio Management | Code | 1 |
| Optimal control of eye-movements during visual search | | 0 |
| SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning | Code | 0 |
| High-confidence error estimates for learned value functions | | 0 |
| Cycle-of-Learning for Autonomous Systems from Human Interaction | Code | 0 |
| A Study of Reinforcement Learning for Neural Machine Translation | Code | 0 |
| NavigationNet: A Large-scale Interactive Indoor Navigation Dataset | | 0 |
| Proximal Policy Optimization and its Dynamic Version for Sequence Generation | | 0 |
| Reinforcement Learning for Relation Classification from Noisy Data | Code | 1 |
| Playing 20 Question Game with Policy-Based Reinforcement Learning | | 0 |
| LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations | Code | 0 |
| Exploring Shared Structures and Hierarchies for Multiple NLP Tasks | | 0 |
| Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning | Code | 0 |
| Goal-oriented Dialogue Policy Learning from Failures | | 0 |
| Catastrophic Importance of Catastrophic Forgetting | | 0 |
| Source-Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language | | 0 |
| Reinforcement Learning for Autonomous Defence in Software-Defined Networking | | 0 |
| Importance mixing: Improving sample reuse in evolutionary policy search methods | | 0 |
| Data Poisoning Attacks in Contextual Bandits | | 0 |
| Context-Aware Visual Policy Network for Sequence-Level Image Captioning | Code | 0 |
| Incorporating Consistency Verification into Neural Data-to-Document Generation | | 0 |
| Deep RTS: A Game Environment for Deep Reinforcement Learning in Real-Time Strategy Games | Code | 0 |
| A Framework for Automated Cellular Network Tuning with Reinforcement Learning | Code | 0 |
| Directed Policy Gradient for Safe Reinforcement Learning with Human Advice | | 0 |
| Visual Sensor Network Reconfiguration with Deep Reinforcement Learning | | 0 |
| End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning | | 0 |
| Policy Optimization as Wasserstein Gradient Flows | | 0 |
| Learning to Share and Hide Intentions using Information Regularization | Code | 0 |
| An Efficient Deep Reinforcement Learning Model for Urban Traffic Control | Code | 0 |
| Regret Bounds for Reinforcement Learning via Markov Chain Concentration | | 0 |
| RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising | Code | 0 |
| Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language | | 0 |
| Interpretable Rationale Augmented Charge Prediction System | | 0 |
| Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning | Code | 0 |
| A New Concept of Deep Reinforcement Learning based Augmented General Tagging System | | 0 |
Page 270 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |