| Average Reward Reinforcement Learning with Monotonic Policy Improvement | Jan 1, 2021 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 | 0 |
| A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning | Jul 1, 2022 | Deep Reinforcement Learning | —Unverified | 0 | 0 |
| A Vision Based Deep Reinforcement Learning Algorithm for UAV Obstacle Avoidance | Mar 11, 2021 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 | 0 |
| A Visual Communication Map for Multi-Agent Deep Reinforcement Learning | Feb 27, 2020 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Avoidance Navigation Based on Offline Pre-Training Reinforcement Learning | Aug 3, 2023 | Deep Reinforcement Learning, Navigate | —Unverified | 0 | 0 |
| Avoiding Catastrophic States with Intrinsic Fear | Jan 1, 2018 | Atari Games, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| AWD3: Dynamic Reduction of the Estimation Bias | Nov 12, 2021 | continuous-control, Continuous Control | —Unverified | 0 | 0 |
| A Wireless Collaborated Inference Acceleration Framework for Plant Disease Recognition | May 5, 2025 | Collaborative Inference, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| AXIOM: Learning to Play Games in Minutes with Expanding Object-Centric Models | May 30, 2025 | Deep Reinforcement Learning | —Unverified | 0 | 0 |
| A Zero-Shot Reinforcement Learning Strategy for Autonomous Guidewire Navigation | Mar 5, 2024 | Deep Reinforcement Learning, Navigate | —Unverified | 0 | 0 |
| Bach2Bach: Generating Music Using A Deep Reinforcement Learning Approach | Dec 3, 2018 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Backbones-Review: Feature Extraction Networks for Deep Learning and Deep Reinforcement Learning Approaches | Jun 16, 2022 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 | 0 |
| BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning | May 2, 2021 | Atari Games, Backdoor Attack | —Unverified | 0 | 0 |
| Backdoors in DRL: Four Environments Focusing on In-distribution Triggers | May 22, 2025 | Backdoor Attack, Data Poisoning | —Unverified | 0 | 0 |
| Balance Between Efficient and Effective Learning: Dense2Sparse Reward Shaping for Robot Manipulation with Environment Uncertainty | Mar 5, 2020 | Deep Reinforcement Learning, Reinforcement Learning | —Unverified | 0 | 0 |
| Balancing SoC in Battery Cells using Safe Action Perturbations | Mar 11, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Bandwidth Reservation for Time-Critical Vehicular Applications: A Multi-Operator Environment | Mar 22, 2025 | Deep Reinforcement Learning, Fairness | —Unverified | 0 | 0 |
| Barrier Function-based Safe Reinforcement Learning for Emergency Control of Power Systems | Mar 26, 2021 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Basal Glucose Control in Type 1 Diabetes using Deep Reinforcement Learning: An In Silico Validation | May 18, 2020 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 | 0 |
| BASIL: Best-Action Symbolic Interpretable Learning for Evolving Compact RL Policies | May 31, 2025 | Acrobot, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation | Dec 16, 2020 | Deep Reinforcement Learning, Distributional Reinforcement Learning | —Unverified | 0 | 0 |
| BCQQ: Batch-Constraint Quantum Q-Learning with Cyclic Data Re-uploading | Apr 27, 2023 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 | 0 |
| Battery and Hydrogen Energy Storage Control in a Smart Energy Network with Flexible Energy Demand using Deep Reinforcement Learning | Aug 26, 2022 | Deep Reinforcement Learning, Scheduling | —Unverified | 0 | 0 |
| Battery Model Calibration with Deep Reinforcement Learning | Dec 7, 2020 | BIG-bench Machine Learning, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics | Jul 21, 2021 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Bayesian Optimization Enhanced Deep Reinforcement Learning for Trajectory Planning and Network Formation in Multi-UAV Networks | Dec 27, 2022 | Bayesian Optimization, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | Nov 15, 2017 | Deep Reinforcement Learning, Efficient Exploration | —Unverified | 0 | 0 |
| BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | Aug 17, 2016 | Deep Reinforcement Learning, Efficient Exploration | —Unverified | 0 | 0 |
| BCEdge: SLO-Aware DNN Inference Services with Adaptive Batching on Edge Platforms | May 1, 2023 | Deep Reinforcement Learning, Scheduling | —Unverified | 0 | 0 |
| β-DQN: Improving Deep Q-Learning By Evolving the Behavior | Jan 1, 2025 | Deep Reinforcement Learning, Efficient Exploration | —Unverified | 0 | 0 |
| Beam Selection in ISAC using Contextual Bandit with Multi-modal Transformer and Transfer Learning | Mar 11, 2025 | Beam Prediction, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Behavioral decision-making for urban autonomous driving in the presence of pedestrians using Deep Recurrent Q-Network | Oct 26, 2020 | Autonomous Driving, Decision Making | —Unverified | 0 | 0 |
| Behaviorally Diverse Traffic Simulation via Reinforcement Learning | Nov 11, 2020 | Autonomous Driving, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Behavior Imitation for Manipulator Control and Grasping with Deep Reinforcement Learning | May 2, 2024 | 3D Human Pose Estimation, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Behaviour-conditioned policies for cooperative reinforcement learning tasks | Oct 4, 2021 | Deep Reinforcement Learning, Meta-Learning | —Unverified | 0 | 0 |
| Behaviour-Diverse Automatic Penetration Testing: A Curiosity-Driven Multi-Objective Deep Reinforcement Learning Approach | Feb 22, 2022 | Deep Reinforcement Learning, Multi-Objective Reinforcement Learning | —Unverified | 0 | 0 |
| Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning | Dec 5, 2021 | Deep Reinforcement Learning, Out-of-Distribution Detection | —Unverified | 0 | 0 |
| Benchmarking Deep Reinforcement Learning Algorithms for Vision-based Robotics | Jan 11, 2022 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Benchmarking Feature Extractors for Reinforcement Learning-Based Semiconductor Defect Localization | Nov 18, 2023 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Benchmarking Lane-changing Decision-making for Deep Reinforcement Learning | Sep 22, 2021 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms | Jan 1, 2021 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Benchmarking Robustness of Deep Reinforcement Learning approaches to Online Portfolio Management | Jun 19, 2023 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation | Dec 16, 2021 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Best Response Shaping | Apr 5, 2024 | Deep Reinforcement Learning, Question Answering | —Unverified | 0 | 0 |
| BET: Explaining Deep Reinforcement Learning through The Error-Prone Decisions | Jan 14, 2024 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban | Oct 3, 2020 | Deep Reinforcement Learning, reinforcement-learning | —Unverified | 0 | 0 |
| Beyond Traditional DoE: Deep Reinforcement Learning for Optimizing Experiments in Model Identification of Battery Dynamics | Oct 12, 2023 | Deep Reinforcement Learning, energy management | —Unverified | 0 | 0 |
| Beyond Training: Optimizing Reinforcement Learning Based Job Shop Scheduling Through Adaptive Action Sampling | Jun 11, 2024 | Deep Reinforcement Learning, Job Shop Scheduling | —Unverified | 0 | 0 |
| Beyond Training-time Poisoning: Component-level and Post-training Backdoors in Deep Reinforcement Learning | Jul 7, 2025 | Backdoor Attack, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| BIBI System Description: Building with CNNs and Breaking with Deep Reinforcement Learning | Sep 1, 2017 | Deep Reinforcement Learning, Q-Learning | —Unverified | 0 | 0 |