| Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP | Jan 27, 2019 | Q-Learning, Reinforcement Learning | Unverified | 0 |
| Provably efficient RL with Rich Observations via Latent State Decoding | Jan 25, 2019 | Clustering, Q-Learning | Code Available | 0 |
| Combinational Q-Learning for Dou Di Zhu | Jan 24, 2019 | Atari Games, Card Games | Code Available | 0 |
| Reinforcement Learning of Markov Decision Processes with Peak Constraints | Jan 23, 2019 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Distillation Strategies for Proximal Policy Optimization | Jan 23, 2019 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Understanding Multi-Step Deep Reinforcement Learning: A Systematic Study of the DQN Target | Jan 22, 2019 | Deep Reinforcement Learning, Q-Learning | Code Available | 0 |
| A Deep Recurrent Q Network towards Self-adapting Distributed Microservices architecture | Jan 13, 2019 | Decision Making, Q-Learning | Code Available | 0 |
| Deep Reinforcement Learning for Imbalanced Classification | Jan 5, 2019 | Classification, Decision Making | Code Available | 0 |
| Accelerating Goal-Directed Reinforcement Learning by Model Characterization | Jan 4, 2019 | model, Model-based Reinforcement Learning | Unverified | 0 |
| Optimal Decision-Making in Mixed-Agent Partially Observable Stochastic Environments via Reinforcement Learning | Jan 4, 2019 | Decision Making, Image Segmentation | Unverified | 0 |
| Adversarial Learning of a Sampler Based on an Unnormalized Distribution | Jan 3, 2019 | Form, Q-Learning | Code Available | 0 |
| A Theoretical Analysis of Deep Q-Learning | Jan 1, 2019 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Information-Directed Exploration for Deep Reinforcement Learning | Dec 18, 2018 | Atari Games, Deep Reinforcement Learning | Code Available | 0 |
| Reinforcement Learning for Adaptive Caching with Dynamic Storage Pricing | Dec 17, 2018 | Decision Making, Q-Learning | Unverified | 0 |
| Double Deep Q-Learning for Optimal Execution | Dec 17, 2018 | Q-Learning | Unverified | 0 |
| Learning Sharing Behaviors with Arbitrary Numbers of Agents | Dec 10, 2018 | Q-Learning | Unverified | 0 |
| A new multilayer optical film optimal method based on deep q-learning | Dec 7, 2018 | Q-Learning | Unverified | 0 |
| Active Deep Q-learning with Demonstration | Dec 6, 2018 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Revisiting the Softmax Bellman Operator: New Benefits and New Perspective | Dec 2, 2018 | Atari Games, Q-Learning | Code Available | 0 |
| Non-delusional Q-learning and value-iteration | Dec 1, 2018 | Q-Learning | Unverified | 0 |
| Urban Driving with Multi-Objective Deep Reinforcement Learning | Nov 21, 2018 | Autonomous Driving, Deep Reinforcement Learning | Code Available | 0 |
| Reinforcement Learning with A* and a Deep Heuristic | Nov 19, 2018 | Q-Learning, reinforcement-learning | Code Available | 0 |
| Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning | Nov 19, 2018 | Active Learning, Q-Learning | Code Available | 0 |
| Emergence of Addictive Behaviors in Reinforcement Learning Agents | Nov 14, 2018 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Deep Q learning for fooling neural networks | Nov 13, 2018 | Q-Learning, Reinforcement Learning | Code Available | 0 |
| Managing App Install Ad Campaigns in RTB: A Q-Learning Approach | Nov 11, 2018 | Q-Learning | Unverified | 0 |
| Deep Reinforcement Learning for Green Security Games with Real-Time Information | Nov 6, 2018 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Deep Reinforcement Learning via L-BFGS Optimization | Nov 6, 2018 | Atari Games, Deep Reinforcement Learning | Unverified | 0 |
| Reinforcement Learning based Dynamic Model Selection for Short-Term Load Forecasting | Nov 5, 2018 | BIG-bench Machine Learning, Load Forecasting | Unverified | 0 |
| Approximate Dynamic Oracle for Dependency Parsing with Reinforcement Learning | Nov 1, 2018 | Dependency Parsing, Imitation Learning | Unverified | 0 |
| Double Q-PID algorithm for mobile robot control | Nov 1, 2018 | Active Learning, Q-Learning | Code Available | 0 |
| Structure Learning of Deep Neural Networks with Q-Learning | Oct 31, 2018 | image-classification, Image Classification | Unverified | 0 |
| Distributive Dynamic Spectrum Access through Deep Reinforcement Learning: A Reservoir Computing Based Approach | Oct 28, 2018 | BIG-bench Machine Learning, Deep Reinforcement Learning | Unverified | 0 |
| Multi-Agent Reinforcement Learning Based Resource Allocation for UAV Networks | Oct 24, 2018 | Multi-agent Reinforcement Learning, Q-Learning | Unverified | 0 |
| Learning Negotiating Behavior Between Cars in Intersections using Deep Q-Learning | Oct 24, 2018 | Q-Learning | Unverified | 0 |
| Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement | Oct 22, 2018 | Policy Gradient Methods, Q-Learning | Code Available | 0 |
| Finding the best design parameters for optical nanostructures using reinforcement learning | Oct 18, 2018 | BIG-bench Machine Learning, Q-Learning | Unverified | 0 |
| Assessing the Potential of Classical Q-learning in General Game Playing | Oct 14, 2018 | Board Games, Deep Reinforcement Learning | Code Available | 0 |
| Learning to Sketch with Deep Q Networks and Demonstrated Strokes | Oct 14, 2018 | Q-Learning | Unverified | 0 |
| Learning to Reason | Oct 12, 2018 | Automated Theorem Proving, Q-Learning | Unverified | 0 |
| Reinforcement Evolutionary Learning Method for self-learning | Oct 7, 2018 | Incremental Learning, Marketing | Unverified | 0 |
| Scaling All-Goals Updates in Reinforcement Learning Using Convolutional Neural Networks | Oct 6, 2018 | All, Montezuma's Revenge | Code Available | 0 |
| Deep Quality-Value (DQV) Learning | Sep 30, 2018 | Atari Games, Deep Reinforcement Learning | Code Available | 0 |
| Reinforcement Learning in R | Sep 29, 2018 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Accelerated Value Iteration via Anderson Mixing | Sep 27, 2018 | Atari Games, Q-Learning | Unverified | 0 |
| A Convergent Variant of the Boltzmann Softmax Operator in Reinforcement Learning | Sep 27, 2018 | Atari Games, Q-Learning | Unverified | 0 |
| Hybrid Policies Using Inverse Rewards for Reinforcement Learning | Sep 27, 2018 | OpenAI Gym, Q-Learning | Unverified | 0 |
| Convergent Reinforcement Learning with Function Approximation: A Bilevel Optimization Perspective | Sep 27, 2018 | Bilevel Optimization, Q-Learning | Unverified | 0 |
| What Would pi* Do?: Imitation Learning via Off-Policy Reinforcement Learning | Sep 27, 2018 | Imitation Learning, Q-Learning | Unverified | 0 |
| The wisdom of the crowd: reliable deep reinforcement learning through ensembles of Q-functions | Sep 27, 2018 | Deep Reinforcement Learning, Policy Gradient Methods | Unverified | 0 |