SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, in the form of rewards or penalties, for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
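As a rough illustration of the loop described above, the following is a minimal sketch of tabular Q-learning, one common RL method, on a hypothetical five-cell corridor. The environment, the function names (`step`, `choose_action`, `train`, `greedy_policy`), and all hyperparameter values are assumptions made for this example and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = step left, 1 = step right
GAMMA = 0.9           # discount factor applied to future reward


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row)
    # Break ties at random so the untrained agent does not get stuck going left.
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train(episodes=500, alpha=0.5, epsilon=0.2, max_steps=200, seed=0):
    """Learn action values Q[state][action] purely from interaction."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy that picks the highest-valued action in each non-terminal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns one action per non-terminal cell. In this toy setting the only reward sits at the right end of the corridor, so the learned policy should step right everywhere. The sketch uses a lookup table for clarity; many of the papers listed here replace it with a neural network.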

Papers

Showing 14801–14850 of 15113 papers

Title | Status | Hype
Model-Free Imitation Learning with Policy Optimization |  | 0
A PAC RL Algorithm for Episodic POMDPs |  | 0
Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks | Code | 0
Learning to Communicate with Deep Multi-Agent Reinforcement Learning | Code | 0
Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition |  | 0
Option Discovery in Hierarchical Reinforcement Learning using Spatio-Temporal Clustering |  | 0
A Reinforcement Learning System to Encourage Physical Activity in Diabetes Patients |  | 0
Optimizing human-interpretable dialog management policy using Genetic Algorithm |  | 0
Avoiding Wireheading with Value Reinforcement Learning |  | 0
ViZDoom: A Doom-based AI Research Platform for Visual Reinforcement Learning | Code | 0
Classifying Options for Deep Reinforcement Learning |  | 0
Tournament selection in zeroth-level classifier systems based on average reward reinforcement learning |  | 0
Using Reinforcement Learning to Validate Empirical Game-Theoretic Analysis: A Continuous Double Auction Study |  | 0
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation | Code | 0
Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics |  | 0
Theoretically-Grounded Policy Advice from Multiple Teachers in Reinforcement Learning Settings with Applications to Negative Transfer |  | 0
A statistical learning strategy for closed-loop control of fluid flows |  | 0
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning | Code | 0
Reinforcement learning based local search for grouping problems: A case study on graph coloring |  | 0
Algorithms for Batch Hierarchical Reinforcement Learning |  | 0
Negative Learning Rates and P-Learning |  | 0
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning | Code | 0
Fully Convolutional Attention Networks for Fine-Grained Recognition |  | 0
Adaptive Parameter Selection in Evolutionary Algorithms by Reinforcement Learning with Dynamic Discretization of Parameter Range |  | 0
Feature Selection as a Multiagent Coordination Problem |  | 0
Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains | Code | 0
A Signaling Game Approach to Databases Querying and Interaction |  | 0
Hierarchical Linearly-Solvable Markov Decision Problems |  | 0
Differentially Private Policy Evaluation |  | 0
Learning Shared Representations in Multi-task Reinforcement Learning |  | 0
Hierarchical Decision Making In Electricity Grid Management |  | 0
Reinforcement Learning of POMDPs using Spectral Methods |  | 0
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks | Code | 0
Thompson Sampling is Asymptotically Optimal in General Environments |  | 0
Meta-learning within Projective Simulation |  | 0
Learning values across many orders of magnitude |  | 0
Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models |  | 0
Inverse Reinforcement Learning in Swarm Systems |  | 0
POMDP-lite for Robust Robot Planning under Uncertainty |  | 0
Reinforcement Learning approach for Real Time Strategy Games Battle city and S3 |  | 0
Deep Exploration via Bootstrapped DQN | Code | 0
Value Iteration Networks | Code | 0
PAC Reinforcement Learning with Rich Observations |  | 0
Graying the black box: Understanding DQNs |  | 0
Data-Efficient Reinforcement Learning in Continuous-State POMDPs |  | 0
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks |  | 0
Active Information Acquisition |  | 0
Quantum machine learning with glow for episodic tasks and decision games |  | 0
Towards Resolving Unidentifiability in Inverse Reinforcement Learning |  | 0
SimpleDS: A Simple Deep Reinforcement Learning Dialogue System | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified