SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
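
The interaction loop described above can be illustrated with a small sketch. It uses tabular Q-learning, which is only one of many RL methods and is chosen here for brevity. The five-state chain environment, the function names (`step`, `choose_action`, `train_q_learning`, `greedy_policy`), and all hyperparameter values are hypothetical and not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a chain of states 0..4.
# The agent starts in state 0; reaching state 4 yields reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row)
    # Break ties at random so the untrained agent does not always pick the same action.
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
                     max_steps=200, seed=0):
    """Learn action values from interaction; all defaults are illustrative assumptions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train_q_learning()
policy = greedy_policy(q_table)
print("Greedy action per non-terminal state:", policy)
```

The discount factor `gamma` is what turns the single reward at the end of the chain into a long-term objective: states farther from the goal receive smaller values, so the learned policy prefers the shortest path to the reward.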

Papers

Showing 13551-13600 of 15113 papers

Title | Status | Hype
Natural Language Person Search Using Deep Reinforcement Learning | | 0
Part-Activated Deep Reinforcement Learning for Action Prediction | | 0
Snap Angle Prediction for 360° Panoramas | | 0
Goal-Oriented Visual Question Generation via Intermediate Rewards | | 0
Collaborative Deep Reinforcement Learning for Multi-Object Tracking | | 0
Deep Reinforcement Learning with Iterative Shift for Visual Tracking | | 0
Dual-Agent Deep Reinforcement Learning for Deformable Face Tracking | | 0
A Contextual-bandit-based Approach for Informed Decision-making in Clinical Trials | | 0
Directed Exploration in PAC Model-Free Reinforcement Learning | | 0
Ensemble Sequence Level Training for Multimodal MT: OSU-Baidu WMT18 Multimodal Machine Translation System Report | | 0
APES: a Python toolbox for simulating reinforcement learning environments | Code | 0
ExIt-OOS: Towards Learning from Planning in Imperfect Information Games | Code | 0
Application of Self-Play Reinforcement Learning to a Four-Player Game of Imperfect Information | Code | 0
A Reinforcement Learning-driven Translation Model for Search-Oriented Conversational Systems | | 0
Learning a Policy for Opportunistic Active Learning | | 0
APRIL: Interactively Learning to Summarise by Combining Active Preference Learning and Reinforcement Learning | Code | 0
Cycle-of-Learning for Autonomous Systems from Human Interaction | Code | 0
High-confidence error estimates for learned value functions | | 0
SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning | Code | 0
Optimal control of eye-movements during visual search | | 0
A Study of Reinforcement Learning for Neural Machine Translation | Code | 0
NavigationNet: A Large-scale Interactive Indoor Navigation Dataset | | 0
Proximal Policy Optimization and its Dynamic Version for Sequence Generation | | 0
Playing 20 Question Game with Policy-Based Reinforcement Learning | | 0
Exploring Shared Structures and Hierarchies for Multiple NLP Tasks | | 0
LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations | Code | 0
Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning | Code | 0
Catastrophic Importance of Catastrophic Forgetting | | 0
Goal-oriented Dialogue Policy Learning from Failures | | 0
Source-Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language | | 0
Reinforcement Learning for Autonomous Defence in Software-Defined Networking | | 0
Importance mixing: Improving sample reuse in evolutionary policy search methods | | 0
Data Poisoning Attacks in Contextual Bandits | | 0
Context-Aware Visual Policy Network for Sequence-Level Image Captioning | Code | 0
Deep RTS: A Game Environment for Deep Reinforcement Learning in Real-Time Strategy Games | Code | 0
Incorporating Consistency Verification into Neural Data-to-Document Generation | | 0
Directed Policy Gradient for Safe Reinforcement Learning with Human Advice | | 0
A Framework for Automated Cellular Network Tuning with Reinforcement Learning | Code | 0
Visual Sensor Network Reconfiguration with Deep Reinforcement Learning | | 0
End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning | | 0
Policy Optimization as Wasserstein Gradient Flows | | 0
Regret Bounds for Reinforcement Learning via Markov Chain Concentration | | 0
An Efficient Deep Reinforcement Learning Model for Urban Traffic Control | Code | 0
Learning to Share and Hide Intentions using Information Regularization | Code | 0
RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising | Code | 0
Structured Dialogue Policy with Graph Neural Networks | | 0
Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language | | 0
Multi-Agent Deep Reinforcement Learning for Dynamic Power Allocation in Wireless Networks | Code | 0
Neural Math Word Problem Solver with Reinforcement Learning | | 0
A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified