| BIBI System Description: Building with CNNs and Breaking with Deep Reinforcement Learning | Sep 1, 2017 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 | 0 |
| Biomimetic Ultra-Broadband Perfect Absorbers Optimised with Reinforcement Learning | Oct 28, 2019 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Blackwell Online Learning for Markov Decision Processes | Dec 28, 2020 | Learning Theory, Q-Learning | —Unverified | 0 | 0 |
| BMG-Q: Localized Bipartite Match Graph Attention Q-Learning for Ride-Pooling Order Dispatch | Jan 23, 2025 | Graph Attention, Graph Sampling | —Unverified | 0 | 0 |
| BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL | May 28, 2025 | Bayesian Optimization, Hyperparameter Optimization | —Unverified | 0 | 0 |
| Boosting Offline Reinforcement Learning with Residual Generative Modeling | Jun 19, 2021 | Offline RL, Q-Learning | —Unverified | 0 | 0 |
| Bootstrapped Hindsight Experience replay with Counterintuitive Prioritization | Sep 29, 2021 | Q-Learning | —Unverified | 0 | 0 |
| Bootstrapping Expectiles in Reinforcement Learning | Jun 6, 2024 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Breaking the Deadly Triad with a Target Network | Jan 21, 2021 | Q-Learning | —Unverified | 0 | 0 |
| Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning | Oct 9, 2021 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Bridging the Gap Between Value and Policy Based Reinforcement Learning | Feb 28, 2017 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Bridging the Performance Gap Between Target-Free and Target-Based Reinforcement Learning With Iterated Q-Learning | Jun 4, 2025 | Q-Learning | —Unverified | 0 | 0 |
| Cache-Aided NOMA Mobile Edge Computing: A Reinforcement Learning Approach | Jun 20, 2019 | Edge-computing, Q-Learning | —Unverified | 0 | 0 |
| Caching Placement and Resource Allocation for Cache-Enabling UAV NOMA Networks | Aug 12, 2020 | Q-Learning, Scheduling | —Unverified | 0 | 0 |
| CAN ALTQ LEARN FASTER: EXPERIMENTS AND THEORY | Sep 25, 2019 | Atari Games, Q-Learning | —Unverified | 0 | 0 |
| Can LLM be a Good Path Planner based on Prompt Engineering? Mitigating the Hallucination for Path Planning | Aug 23, 2024 | Hallucination, Prompt Engineering | —Unverified | 0 | 0 |
| Can Q-Learning be Improved with Advice? | Oct 25, 2021 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Can Q-learning solve Multi Armed Bantids? | Oct 21, 2021 | Decision Making, Q-Learning | —Unverified | 0 | 0 |
| Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory | Jun 8, 2020 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 | 0 |
| Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory | Dec 1, 2020 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 | 0 |
| CAQL: Continuous Action Q-Learning | Sep 26, 2019 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| Career Path Recommendations for Long-term Income Maximization: A Reinforcement Learning Approach | Sep 11, 2023 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| CARL-DTN: Context Adaptive Reinforcement Learning based Routing Algorithm in Delay Tolerant Network | May 2, 2021 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Catalytic evolution of cooperation in a population with behavioural bimodality | Jun 17, 2024 | Q-Learning | —Unverified | 0 | 0 |
| Catch Me If You Can: Improving Adversaries in Cyber-Security With Q-Learning Algorithms | Feb 7, 2023 | Q-Learning | —Unverified | 0 | 0 |
| Causal Deep Reinforcement Learning Using Observational Data | Nov 28, 2022 | Autonomous Driving, Causal Inference | —Unverified | 0 | 0 |
| Causal Mean Field Multi-Agent Reinforcement Learning | Feb 20, 2025 | Multi-agent Reinforcement Learning, Q-Learning | —Unverified | 0 | 0 |
| Cell Switching in HAPS-Aided Networking: How the Obscurity of Traffic Loads Affects the Decision | May 1, 2024 | Q-Learning | —Unverified | 0 | 0 |
| Cellular traffic offloading via Opportunistic Networking with Reinforcement Learning | Oct 1, 2021 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Censored Deep Reinforcement Patrolling with Information Criterion for Monitoring Large Water Resources using Autonomous Surface Vehicles | Oct 12, 2022 | Autonomous Vehicles, Q-Learning | —Unverified | 0 | 0 |
| Challenging On Car Racing Problem from OpenAI gym | Nov 2, 2019 | Car Racing, continuous-control | —Unverified | 0 | 0 |
| Channel Estimation via Successive Denoising in MIMO OFDM Systems: A Reinforcement Learning Approach | Jan 25, 2021 | Denoising, Q-Learning | —Unverified | 0 | 0 |
| Characterizing the Action-Generalization Gap in Deep Q-Learning | May 11, 2022 | Q-Learning, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Chemoreception and chemotaxis of a three-sphere swimmer | May 5, 2022 | Q-Learning | —Unverified | 0 | 0 |
| Chrome Dino Run using Reinforcement Learning | Aug 15, 2020 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| C-Learning: Learning to Achieve Goals via Recursive Classification | Nov 17, 2020 | Classification, Density Estimation | —Unverified | 0 | 0 |
| Collaborative Deep Reinforcement Learning for Joint Object Search | Feb 18, 2017 | Active Object Localization, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear | Nov 3, 2016 | Atari Games, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Combining policy gradient and Q-learning | Nov 5, 2016 | Atari Games, Q-Learning | —Unverified | 0 | 0 |
| Combining Q-Learning and Search with Amortized Value Estimates | Dec 5, 2019 | Q-Learning | —Unverified | 0 | 0 |
| Comparative Analysis of Multi-Agent Reinforcement Learning Policies for Crop Planning Decision Support | Dec 3, 2024 | Computational Efficiency, Fairness | —Unverified | 0 | 0 |
| Comparative Study of Q-Learning and NeuroEvolution of Augmenting Topologies for Self Driving Agents | Sep 19, 2022 | Autonomous Driving, Evolutionary Algorithms | —Unverified | 0 | 0 |
| Comparing NARS and Reinforcement Learning: An Analysis of ONA and Q-Learning Algorithms | Mar 17, 2023 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Compositional Reinforcement Learning for Discrete-Time Stochastic Control Systems | Aug 6, 2022 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Compressive Features in Offline Reinforcement Learning for Recommender Systems | Nov 16, 2021 | Q-Learning, Recommendation Systems | —Unverified | 0 | 0 |
| Computation Offloading for Uncertain Marine Tasks by Cooperation of UAVs and Vessels | Feb 13, 2023 | Q-Learning | —Unverified | 0 | 0 |
| Computing and Learning Stationary Mean Field Equilibria with Scalar Interactions: Algorithms and Applications | Feb 2, 2025 | counterfactual, Policy Gradient Methods | —Unverified | 0 | 0 |
| Concentration bounds for SSP Q-learning for average cost MDPs | Jun 7, 2022 | Q-Learning | —Unverified | 0 | 0 |
| Concentration of Contractive Stochastic Approximation and Reinforcement Learning | Jun 27, 2021 | Q-Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Concentration of Contractive Stochastic Approximation: Additive and Multiplicative Noise | Mar 28, 2023 | Q-Learning | —Unverified | 0 | 0 |