| A Technique to Create Weaker Abstract Board Game Agents via Reinforcement Learning | Sep 1, 2022 | Board Games, Q-Learning | Unverified | 0 |
| Partial Counterfactual Identification for Infinite Horizon Partially Observable Markov Decision Process | Aug 31, 2022 | counterfactual, Q-Learning | Unverified | 0 |
| Direct Data-Driven Discrete-time Bilinear Biquadratic Regulator | Aug 29, 2022 | Q-Learning | Unverified | 0 |
| Goal-Conditioned Q-Learning as Knowledge Distillation | Aug 28, 2022 | Knowledge Distillation, Q-Learning | Code Available | 0 |
| Object Goal Navigation using Data Regularized Q-Learning | Aug 27, 2022 | Data Augmentation, Deep Reinforcement Learning | Unverified | 0 |
| Prospect Theory-inspired Automated P2P Energy Trading with Q-learning-based Dynamic Pricing | Aug 26, 2022 | energy trading, Q-Learning | Unverified | 0 |
| Recurrent Neural Network-based Anti-jamming Framework for Defense Against Multiple Jamming Policies | Aug 19, 2022 | Q-Learning | Unverified | 0 |
| A Novel Resource Allocation for Anti-jamming in Cognitive-UAVs: an Active Inference Approach | Aug 10, 2022 | Bayesian Inference, Q-Learning | Unverified | 0 |
| Compositional Reinforcement Learning for Discrete-Time Stochastic Control Systems | Aug 6, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Reinforcement Learning for Joint V2I Network Selection and Autonomous Driving Policies | Aug 3, 2022 | Autonomous Driving, Autonomous Vehicles | Unverified | 0 |
| Digital Twin-Assisted Efficient Reinforcement Learning for Edge Task Scheduling | Aug 2, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach | Aug 1, 2022 | continuous-control, Continuous Control | Code Available | 0 |
| A Maintenance Planning Framework using Online and Offline Deep Reinforcement Learning | Aug 1, 2022 | Asset Management, Deep Reinforcement Learning | Unverified | 0 |
| Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning | Jul 28, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Structural Similarity for Improved Transfer in Reinforcement Learning | Jul 27, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View | Jul 25, 2022 | Q-Learning | Unverified | 0 |
| On Decentralizing Federated Reinforcement Learning in Multi-Robot Scenarios | Jul 19, 2022 | Federated Learning, Q-Learning | Unverified | 0 |
| Multi-Source AoI-Constrained Resource Minimization under HARQ: Heterogeneous Sampling Processes | Jul 19, 2022 | Q-Learning, Scheduling | Unverified | 0 |
| DDPG Learning for Aerial RIS-Assisted MU-MISO Communications | Jul 13, 2022 | Q-Learning | Unverified | 0 |
| Approximate Nash Equilibrium Learning for n-Player Markov Games in Dynamic Pricing | Jul 13, 2022 | Q-Learning | Unverified | 0 |
| Multi-objective Optimization of Notifications Using Offline Reinforcement Learning | Jul 7, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Planning with RL and episodic-memory behavioral priors | Jul 5, 2022 | Imitation Learning, Q-Learning | Unverified | 0 |
| q-Learning in Continuous Time | Jul 2, 2022 | Learning Theory, Q-Learning | Unverified | 0 |
| Interactive Learning from Natural Language and Demonstrations using Signal Temporal Logic | Jul 1, 2022 | Formal Logic, Q-Learning | Unverified | 0 |
| Action-modulated midbrain dopamine activity arises from distributed control policies | Jul 1, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Predicting the Need for Blood Transfusion in Intensive Care Units with Reinforcement Learning | Jun 26, 2022 | Decision Making, Q-Learning | Unverified | 0 |
| Reinforcement Learning under Partial Observability Guided by Learned Environment Models | Jun 23, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Recursive Reinforcement Learning | Jun 23, 2022 | Ingenuity, Q-Learning | Unverified | 0 |
| Federated Stochastic Approximation under Markov Noise and Heterogeneity: Applications in Reinforcement Learning | Jun 21, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| The Integration of Machine Learning into Automated Test Generation: A Systematic Mapping Study | Jun 21, 2022 | BIG-bench Machine Learning, Q-Learning | Unverified | 0 |
| Visual Radial Basis Q-Network | Jun 14, 2022 | Q-Learning, Reinforcement Learning (RL) | Unverified | 0 |
| RL-GA: A Reinforcement Learning-Based Genetic Algorithm for Electromagnetic Detection Satellite Scheduling Problem | Jun 12, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Cooperation between Independent Market Makers | Jun 11, 2022 | Q-Learning | Code Available | 0 |
| An Optimization Method-Assisted Ensemble Deep Reinforcement Learning Algorithm to Solve Unit Commitment Problems | Jun 9, 2022 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| A Study of Continual Learning Methods for Q-Learning | Jun 8, 2022 | Continual Learning, Q-Learning | Unverified | 0 |
| Concentration bounds for SSP Q-learning for average cost MDPs | Jun 7, 2022 | Q-Learning | Unverified | 0 |
| Introspective Experience Replay: Look Back When Surprised | Jun 7, 2022 | Q-Learning, reinforcement-learning | Code Available | 0 |
| DeepTPI: Test Point Insertion with Deep Reinforcement Learning | Jun 7, 2022 | Deep Reinforcement Learning, Graph Neural Network | Code Available | 0 |
| Balancing Profit, Risk, and Sustainability for Portfolio Management | Jun 6, 2022 | Management, Portfolio Optimization | Unverified | 0 |
| DDPG based on multi-scale strokes for financial time series trading strategy | Jun 5, 2022 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| CoNSoLe: Convex Neural Symbolic Learning | Jun 1, 2022 | Q-Learning | Unverified | 0 |
| Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning | Jun 1, 2022 | Q-Learning | Unverified | 0 |
| Graph Backup: Data Efficient Backup Exploiting Markovian Transitions | May 31, 2022 | Atari Games, counterfactual | Code Available | 0 |
| GraMeR: Graph Meta Reinforcement Learning for Multi-Objective Influence Maximization | May 30, 2022 | Computational Efficiency, Marketing | Unverified | 0 |
| Designing Rewards for Fast Learning | May 30, 2022 | Q-Learning, Reinforcement Learning (RL) | Unverified | 0 |
| Deep Reinforcement Learning for Distributed and Uncoordinated Cognitive Radios Resource Allocation | May 27, 2022 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Does DQN Learn? | May 26, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment | May 26, 2022 | Multi-Armed Bandits, Q-Learning | Unverified | 0 |
| An Experimental Comparison Between Temporal Difference and Residual Gradient with Neural Network Approximation | May 25, 2022 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Analytics of Business Time Series Using Machine Learning and Bayesian Inference | May 25, 2022 | Bayesian Inference, BIG-bench Machine Learning | Unverified | 0 |