| Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning | Apr 2, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| Inverse RL Scene Dynamics Learning for Nonlinear Predictive Control in Autonomous Vehicles | Apr 2, 2025 | Autonomous Navigation, Autonomous Vehicles | Unverified | 0 |
| Late Breaking Results: Breaking Symmetry- Unconventional Placement of Analog Circuits using Multi-Level Multi-Agent Reinforcement Learning | Mar 29, 2025 | Multi-agent Reinforcement Learning, Q-Learning | Unverified | 0 |
| Optimal Path Planning and Cost Minimization for a Drone Delivery System Via Model Predictive Control | Mar 25, 2025 | Model Predictive Control, Multi-agent Reinforcement Learning | Unverified | 0 |
| Reinforcement Learning in Switching Non-Stationary Markov Decision Processes: Algorithms and Convergence Analysis | Mar 24, 2025 | Decision Making, Q-Learning | Unverified | 0 |
| Finite-Time Bounds for Two-Time-Scale Stochastic Approximation with Arbitrary Norm Contractions and Markovian Noise | Mar 24, 2025 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Bandwidth Reservation for Time-Critical Vehicular Applications: A Multi-Operator Environment | Mar 22, 2025 | Deep Reinforcement Learning, Fairness | Unverified | 0 |
| Planning and Learning in Average Risk-aware MDPs | Mar 22, 2025 | Q-Learning | Unverified | 0 |
| Deep Q-Learning with Gradient Target Tracking | Mar 20, 2025 | Q-Learning | Unverified | 0 |
| APF+: Boosting adaptive-potential function reinforcement learning methods with a W-shaped network for high-dimensional games | Mar 17, 2025 | Atari Games, Q-Learning | Unverified | 0 |
| Residual Policy Gradient: A Reward View of KL-regularized Objective | Mar 14, 2025 | Imitation Learning, MuJoCo | Unverified | 0 |
| Exploring Competitive and Collusive Behaviors in Algorithmic Pricing with Deep Reinforcement Learning | Mar 14, 2025 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| Multi-Agent Q-Learning Dynamics in Random Networks: Convergence due to Exploration and Sparsity | Mar 13, 2025 | Q-Learning, Stochastic Block Model | Unverified | 0 |
| PairVDN - Pair-wise Decomposed Value Functions | Mar 12, 2025 | Q-Learning | Code Available | 0 |
| A Novel Multi-Objective Reinforcement Learning Algorithm for Pursuit-Evasion Game | Mar 9, 2025 | Multi-Objective Reinforcement Learning, Q-Learning | Unverified | 0 |
| Generative Multi-Agent Q-Learning for Policy Optimization: Decentralized Wireless Networks | Mar 7, 2025 | Q-Learning, Reinforcement Learning (RL) | Unverified | 0 |
| Quantum-Inspired Reinforcement Learning in the Presence of Epistemic Ambivalence | Mar 6, 2025 | Decision Making, Decision Making Under Uncertainty | Unverified | 0 |
| Multi-Agent Inverse Q-Learning from Demonstrations | Mar 6, 2025 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| DO-IQS: Dynamics-Aware Offline Inverse Q-Learning for Optimal Stopping with Unknown Gain Functions | Mar 5, 2025 | Q-Learning | Unverified | 0 |
| Navigating Intelligence: A Survey of Google OR-Tools and Machine Learning for Global Path Planning in Autonomous Vehicles | Mar 5, 2025 | Autonomous Vehicles, Q-Learning | Unverified | 0 |
| POPGym Arcade: Parallel Pixelated POMDPs | Mar 3, 2025 | counterfactual, Imitation Learning | Code Available | 1 |
| An Efficient and Uncertainty-aware Reinforcement Learning Framework for Quality Assurance in Extrusion Additive Manufacturing | Mar 2, 2025 | Q-Learning, Uncertainty Quantification | Unverified | 0 |
| Nucleolus Credit Assignment for Effective Coalitions in Multi-agent Reinforcement Learning | Mar 1, 2025 | Multi-agent Reinforcement Learning, Q-Learning | Unverified | 0 |
| Cycles and collusion in congestion games under Q-learning | Feb 26, 2025 | Q-Learning | Unverified | 0 |
| Policy Learning with a Natural Language Action Space: A Causal Approach | Feb 24, 2025 | Decision Making, Q-Learning | Unverified | 0 |
| Yes, Q-learning Helps Offline In-Context RL | Feb 24, 2025 | In-Context Reinforcement Learning, MuJoCo | Unverified | 0 |
| Algorithmic Collusion under Observed Demand Shocks | Feb 20, 2025 | Q-Learning | Unverified | 0 |
| Is Q-learning an Ill-posed Problem? | Feb 20, 2025 | Q-Learning, reinforcement-learning | Unverified | 0 |
| Causal Mean Field Multi-Agent Reinforcement Learning | Feb 20, 2025 | Multi-agent Reinforcement Learning, Q-Learning | Unverified | 0 |
| A Non-Asymptotic Theory of Seminorm Lyapunov Stability: From Deterministic to Stochastic Iterative Algorithms | Feb 20, 2025 | Q-Learning | Unverified | 0 |
| Multi-Objective Reinforcement Learning for Critical Scenario Generation of Autonomous Vehicles | Feb 18, 2025 | Autonomous Vehicles, Multi-Objective Reinforcement Learning | Unverified | 0 |
| Digi-Q: Learning Q-Value Functions for Training Device-Control Agents | Feb 13, 2025 | Q-Learning, Reinforcement Learning (RL) | Code Available | 2 |
| Few is More: Task-Efficient Skill-Discovery for Multi-Task Offline Multi-Agent Reinforcement Learning | Feb 13, 2025 | Learning to Execute, Multi-agent Reinforcement Learning | Unverified | 0 |
| Evolution of cooperation in a bimodal mixture of conditional cooperators | Feb 11, 2025 | Q-Learning | Code Available | 0 |
| ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy | Feb 8, 2025 | Q-Learning, Safe Exploration | Code Available | 3 |
| Optimizing Wireless Resource Management and Synchronization in Digital Twin Networks | Feb 7, 2025 | Management, Q-Learning | Unverified | 0 |
| Seasonal Station-Keeping of Short Duration High Altitude Balloons using Deep Reinforcement Learning | Feb 7, 2025 | Deep Reinforcement Learning, Diversity | Unverified | 0 |
| Fast Adaptive Anti-Jamming Channel Access via Deep Q Learning and Coarse-Grained Spectrum Prediction | Feb 7, 2025 | Deep Reinforcement Learning, Q-Learning | Unverified | 0 |
| CleanSurvival: Automated data preprocessing for time-to-event models using reinforcement learning | Feb 6, 2025 | Imputation, Outlier Detection | Code Available | 0 |
| DECAF: Learning to be Fair in Multi-agent Resource Allocation | Feb 6, 2025 | Fairness, Q-Learning | Unverified | 0 |
| VistaFlow: Photorealistic Volumetric Reconstruction with Dynamic Resolution Management via Q-Learning | Feb 5, 2025 | CPU, Management | Unverified | 0 |
| Gap-Dependent Bounds for Federated Q-learning | Feb 5, 2025 | Q-Learning | Unverified | 0 |
| Efficient Triangular Arbitrage Detection via Graph Neural Networks | Feb 5, 2025 | Q-Learning | Unverified | 0 |
| Flow Q-Learning | Feb 4, 2025 | Action Generation, D4RL | Code Available | 3 |
| Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer | Feb 4, 2025 | Q-Learning, SMAC | Code Available | 0 |
| Resilient UAV Trajectory Planning via Few-Shot Meta-Offline Reinforcement Learning | Feb 3, 2025 | Meta-Learning, Offline RL | Unverified | 0 |
| Computing and Learning Stationary Mean Field Equilibria with Scalar Interactions: Algorithms and Applications | Feb 2, 2025 | counterfactual, Policy Gradient Methods | Unverified | 0 |
| An MDP Model for Censoring in Harvesting Sensors: Optimal and Approximated Solutions | Feb 2, 2025 | Q-Learning | Unverified | 0 |
| Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network | Feb 1, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| Linear Q-Learning Does Not Diverge: Convergence Rates to a Bounded Set | Jan 31, 2025 | Q-Learning | Unverified | 0 |