| Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding | Mar 12, 2024 | Multi-Agent Path Finding, Multi-agent Reinforcement Learning | Code Available | 2 |
| Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning | Mar 12, 2024 | continuous-control, Continuous Control | Unverified | 0 |
| Finite-Time Error Analysis of Soft Q-Learning: Switching System Approach | Mar 11, 2024 | Q-Learning | Unverified | 0 |
| Scalable Online Exploration via Coverability | Mar 11, 2024 | Efficient Exploration, Q-Learning | Code Available | 0 |
| Algorithmic Collusion and Price Discrimination: The Over-Usage of Data | Mar 10, 2024 | Q-Learning | Unverified | 0 |
| Enhancing Classification Performance via Reinforcement Learning for Feature Selection | Mar 9, 2024 | Classification, feature selection | Unverified | 0 |
| Belief-Enriched Pessimistic Q-Learning against Adversarial State Perturbations | Mar 6, 2024 | Q-Learning, Reinforcement Learning (RL) | Code Available | 0 |
| SMAUG: A Sliding Multidimensional Task Window-Based MARL Framework for Adaptive Real-Time Subtask Recognition | Mar 4, 2024 | Hierarchical Reinforcement Learning, Multi-agent Reinforcement Learning | Unverified | 0 |
| Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning | Mar 2, 2024 | Decoder, Multi-agent Reinforcement Learning | Code Available | 2 |
| QF-tuner: Breaking Tradition in Reinforcement Learning | Feb 26, 2024 | OpenAI Gym, Q-Learning | Unverified | 0 |
| SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning | Feb 20, 2024 | Imitation Learning, Q-Learning | Code Available | 0 |
| Reinforcement Learning for Optimal Execution when Liquidity is Time-Varying | Feb 19, 2024 | Q-Learning, reinforcement-learning | Unverified | 0 |
| An Index Policy Based on Sarsa and Q-learning for Heterogeneous Smart Target Tracking | Feb 19, 2024 | Q-Learning, Scheduling | Unverified | 0 |
| Easy as ABCs: Unifying Boltzmann Q-Learning and Counterfactual Regret Minimization | Feb 19, 2024 | counterfactual, OpenAI Gym | Unverified | 0 |
| Finite-Time Error Analysis of Online Model-Based Q-Learning with a Relaxed Sampling Model | Feb 19, 2024 | model, Q-Learning | Unverified | 0 |
| Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling | Feb 19, 2024 | Avg, Multi-agent Reinforcement Learning | Unverified | 0 |
| Reinforcement learning to maximise wind turbine energy generation | Feb 17, 2024 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Exploiting Estimation Bias in Clipped Double Q-Learning for Continous Control Reinforcement Learning Tasks | Feb 14, 2024 | Computational Efficiency, continuous-control | Unverified | 0 |
| Conservative and Risk-Aware Offline Multi-Agent Reinforcement Learning | Feb 13, 2024 | Multi-agent Reinforcement Learning, Q-Learning | Code Available | 0 |
| Intelligent Agricultural Management Considering N_2O Emission and Climate Variability with Uncertainties | Feb 13, 2024 | Decision Making, Management | Unverified | 0 |
| Enhanced Deep Q-Learning for 2D Self-Driving Cars: Implementation and Evaluation on a Custom Track Environment | Feb 13, 2024 | Q-Learning, Self-Driving Cars | Unverified | 0 |
| Leveraging Digital Cousins for Ensemble Q-Learning in Large-Scale Wireless Networks | Feb 12, 2024 | Ensemble Learning, Management | Code Available | 0 |
| Federated Deep Q-Learning and 5G load balancing | Feb 10, 2024 | Q-Learning | Unverified | 0 |
| Solving Deep Reinforcement Learning Tasks with Evolution Strategies and Linear Policy Networks | Feb 10, 2024 | Atari Games, Deep Reinforcement Learning | Code Available | 0 |
| ORIENT: A Priority-Aware Energy-Efficient Approach for Latency-Sensitive Applications in 6G | Feb 10, 2024 | Q-Learning | Unverified | 0 |
| Value function interference and greedy action selection in value-based multi-objective reinforcement learning | Feb 9, 2024 | Multi-Objective Reinforcement Learning, Q-Learning | Unverified | 0 |
| Attention-Enhanced Prioritized Proximal Policy Optimization for Adaptive Edge Caching | Feb 8, 2024 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Enhancement of High-definition Map Update Service Through Coverage-aware and Reinforcement Learning | Feb 8, 2024 | Autonomous Driving, Autonomous Vehicles | Unverified | 0 |
| Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices | Feb 8, 2024 | Federated Learning, Offline RL | Unverified | 0 |
| Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization | Feb 8, 2024 | Q-Learning, reinforcement-learning | Code Available | 0 |
| A Deep Reinforcement Learning Approach for Adaptive Traffic Routing in Next-gen Networks | Feb 7, 2024 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents | Feb 6, 2024 | continuous-control, Continuous Control | Code Available | 0 |
| Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning | Feb 5, 2024 | D4RL, Q-Learning | Unverified | 0 |
| Multi-Agent Reinforcement Learning for Offloading Cellular Communications with Cooperating UAVs | Feb 5, 2024 | Decision Making, Multi-agent Reinforcement Learning | Unverified | 0 |
| MinMaxMin Q-learning | Feb 3, 2024 | MuJoCo, Q-Learning | Unverified | 0 |
| SQT -- std Q-target | Feb 3, 2024 | MuJoCo, Q-Learning | Unverified | 0 |
| Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error | Feb 3, 2024 | Adversarial Robustness, Deep Reinforcement Learning | Code Available | 1 |
| DRL-Based Dynamic Channel Access and SCLAR Maximization for Networks Under Jamming | Feb 2, 2024 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Deep Robot Sketching: An application of Deep Q-Learning Networks for human-like sketching | Feb 1, 2024 | Q-Learning, reinforcement-learning | Unverified | 0 |
| RadDQN: a Deep Q Learning-based Architecture for Finding Time-efficient Minimum Radiation Exposure Pathway | Feb 1, 2024 | Decision Making, Deep Reinforcement Learning | Code Available | 0 |
| FM3Q: Factorized Multi-Agent MiniMax Q-Learning for Two-Team Zero-Sum Markov Game | Feb 1, 2024 | Multi-agent Reinforcement Learning, Q-Learning | Unverified | 0 |
| Nash Soft Actor-Critic LEO Satellite Handover Management Algorithm for Flying Vehicles | Jan 31, 2024 | Blocking, Management | Unverified | 0 |
| Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning | Jan 31, 2024 | Efficient Exploration, Model-based Reinforcement Learning | Unverified | 0 |
| Extrinsicaly Rewarded Soft Q Imitation Learning with Discriminator | Jan 30, 2024 | Imitation Learning, MuJoCo | Unverified | 0 |
| Emergence of cooperation under punishment: A reinforcement learning perspective | Jan 29, 2024 | Imitation Learning, Q-Learning | Unverified | 0 |
| Regularized Q-Learning with Linear Function Approximation | Jan 26, 2024 | Decision Making Under Uncertainty, Q-Learning | Unverified | 0 |
| Constant Stepsize Q-learning: Distributional Convergence, Bias and Extrapolation | Jan 25, 2024 | Q-Learning, Reinforcement Learning (RL) | Unverified | 0 |
| Information-Theoretic State Variable Selection for Reinforcement Learning | Jan 21, 2024 | Decision Making, feature selection | Code Available | 0 |
| VQC-Based Reinforcement Learning with Data Re-uploading: Performance and Trainability | Jan 21, 2024 | Q-Learning, reinforcement-learning | Code Available | 0 |
| REValueD: Regularised Ensemble Value-Decomposition for Factorisable Markov Decision Processes | Jan 16, 2024 | Multi-agent Reinforcement Learning, Q-Learning | Unverified | 0 |