| GenCos' Behaviors Modeling Based on Q Learning Improved by Dichotomy | Aug 4, 2020 | Q-Learning | Unverified | 0 |
| Cooperative Control of Mobile Robots with Stackelberg Learning | Aug 3, 2020 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| QPLEX: Duplex Dueling Multi-Agent Q-Learning | Aug 3, 2020 | Decision Making, Multi-agent Reinforcement Learning | Code Available | 1 |
| Momentum Q-learning with Finite-Sample Convergence Guarantee | Jul 30, 2020 | Q-Learning | Unverified | 0 |
| Deep Reinforcement Learning for Dynamic Spectrum Sensing and Aggregation in Multi-Channel Wireless Networks | Jul 28, 2020 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient | Jul 25, 2020 | Q-Learning, reinforcement-learning | Unverified | 0 |
| A Comparative Study of AI-based Intrusion Detection Techniques in Critical Infrastructures | Jul 24, 2020 | Intrusion Detection, Management | Unverified | 0 |
| EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL | Jul 21, 2020 | D4RL, Decision Making | Unverified | 0 |
| Trade-off on Sim2Real Learning: Real-world Learning Faster than Simulations | Jul 21, 2020 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| A Machine Learning Approach for Task and Resource Allocation in Mobile Edge Computing Based Networks | Jul 20, 2020 | BIG-bench Machine Learning, Edge-computing | Unverified | 0 |
| Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense | Jul 20, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Unverified | 0 |
| Same-Day Delivery with Fairness | Jul 19, 2020 | Fairness, Q-Learning | Unverified | 0 |
| Meta-Gradient Reinforcement Learning with an Objective Discovered Online | Jul 16, 2020 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Reinforcement Learning-Enabled Decision-Making Strategies for a Vehicle-Cyber-Physical-System in Connected Environment | Jul 16, 2020 | Autonomous Vehicles, Decision Making | Unverified | 0 |
| DRIFT: Deep Reinforcement Learning for Functional Software Testing | Jul 16, 2020 | Deep Reinforcement Learning, Graph Neural Network | Unverified | 0 |
| PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning | Jul 16, 2020 | Policy Gradient Methods, Q-Learning | Code Available | 0 |
| Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent | Jul 15, 2020 | Atari Games, Q-Learning | Unverified | 0 |
| Qgraph-bounded Q-learning: Stabilizing Model-Free Off-Policy Deep Reinforcement Learning | Jul 15, 2020 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Single-partition adaptive Q-learning | Jul 14, 2020 | Q-Learning, Reinforcement Learning (RL) | Code Available | 0 |
| Revisiting Fundamentals of Experience Replay | Jul 13, 2020 | Deep Reinforcement Learning, DQN Replay Dataset | Code Available | 0 |
| SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning | Jul 9, 2020 | Deep Reinforcement Learning, Diversity | Code Available | 1 |
| The Mean-Squared Error of Double Q-Learning | Jul 9, 2020 | Q-Learning | Code Available | 0 |
| Neural Interactive Collaborative Filtering | Jul 4, 2020 | Collaborative Filtering, Meta-Learning | Code Available | 1 |
| Reward Machines for Cooperative Multi-Agent Reinforcement Learning | Jul 3, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Hedging using reinforcement learning: Contextual k-Armed Bandit versus Q-learning | Jul 3, 2020 | Friction, Q-Learning | Unverified | 0 |
| Group Equivariant Deep Reinforcement Learning | Jul 1, 2020 | Deep Reinforcement Learning, Inductive Bias | Code Available | 0 |
| Regularly Updated Deterministic Policy Gradient Algorithm | Jul 1, 2020 | MuJoCo, Q-Learning | Unverified | 0 |
| Gradient Temporal-Difference Learning with Regularized Corrections | Jul 1, 2020 | Q-Learning | Code Available | 1 |
| Provably More Efficient Q-Learning in the One-Sided-Feedback/Full-Feedback Settings | Jun 30, 2020 | Q-Learning | Unverified | 0 |
| Using Reinforcement Learning to Herd a Robotic Swarm to a Target Distribution | Jun 29, 2020 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Concept and the implementation of a tool to convert industry 4.0 environments modeled as FSM to an OpenAI Gym wrapper | Jun 29, 2020 | OpenAI Gym, Q-Learning | Unverified | 0 |
| Image Classification by Reinforcement Learning with Two-State Q-Learning | Jun 28, 2020 | Classification, General Classification | Code Available | 1 |
| Reinforcement Learning Based Handwritten Digit Recognition with Two-State Q-Learning | Jun 28, 2020 | Benchmarking, Handwritten Digit Recognition | Unverified | 0 |
| Lookahead-Bounded Q-Learning | Jun 28, 2020 | Q-Learning | Code Available | 0 |
| Active Finite Reward Automaton Inference and Reinforcement Learning Using Queries and Counterexamples | Jun 28, 2020 | Active Learning, Deep Reinforcement Learning | Unverified | 0 |
| Offline Contextual Bandits with Overparameterized Models | Jun 27, 2020 | Multi-Armed Bandits, Q-Learning | Code Available | 0 |
| Q-Learning with Differential Entropy of Q-Tables | Jun 26, 2020 | Q-Learning | Unverified | 0 |
| Deep Q-Network-Driven Catheter Segmentation in 3D US by Hybrid Constrained Semi-Supervised Learning and Dual-UNet | Jun 25, 2020 | Q-Learning | Unverified | 0 |
| Unified Reinforcement Q-Learning for Mean Field Game and Control Problems | Jun 24, 2020 | Q-Learning, Reinforcement Learning (RL) | Unverified | 0 |
| Energy Minimization in UAV-Aided Networks: Actor-Critic Learning for Constrained Scheduling Optimization | Jun 24, 2020 | Combinatorial Optimization, Deep Reinforcement Learning | Unverified | 0 |
| Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity | Jun 24, 2020 | Diversity, Q-Learning | Unverified | 0 |
| Deep Reinforcement Learning Control for Radar Detection and Tracking in Congested Spectral Environments | Jun 23, 2020 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret | Jun 22, 2020 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Near-Optimal Reinforcement Learning with Self-Play | Jun 22, 2020 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Hybridizing the 1/5-th Success Rule with Q-Learning for Controlling the Mutation Rate of an Evolutionary Algorithm | Jun 19, 2020 | Evolutionary Algorithms, Q-Learning | Unverified | 0 |
| Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning | Jun 18, 2020 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 1 |
| Parameterized MDPs and Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework | Jun 17, 2020 | Decision Making, Q-Learning | Unverified | 0 |
| Semantic Visual Navigation by Watching YouTube Videos | Jun 17, 2020 | Q-Learning, Visual Navigation | Code Available | 1 |
| Q-learning with Logarithmic Regret | Jun 16, 2020 | Q-Learning | Unverified | 0 |
| The Sample Complexity of Teaching-by-Reinforcement on Q-Learning | Jun 16, 2020 | Q-Learning, reinforcement-learning | Unverified | 0 |