Title | Date | Tags | Status
Curriculum-Guided Antifragile Reinforcement Learning for Secure UAV Deconfliction under Observation-Space Attacks | Jun 26, 2025 | Decision Making, Reinforcement Learning (RL) | Unverified
Homogenization of Multi-agent Learning Dynamics in Finite-state Markov Games | Jun 26, 2025 | Reinforcement Learning (RL) | Code Available
Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards | Jun 25, 2025 | Reinforcement Learning (RL) | Unverified
Complex Model Transformations by Reinforcement Learning with Uncertain Human Guidance | Jun 25, 2025 | Reinforcement Learning (RL) | Code Available
Reinforcement Learning Increases Wind Farm Power Production by Enabling Closed-Loop Collaborative Control | Jun 25, 2025 | Bayesian Optimization, Reinforcement Learning (RL) | Code Available
Causal-Aware Intelligent QoE Optimization for VR Interaction with Adaptive Keyframe Extraction | Jun 24, 2025 | Causal Inference, CPU | Unverified
A Comparative Analysis of Reinforcement Learning and Conventional Deep Learning Approaches for Bearing Fault Diagnosis | Jun 24, 2025 | Diagnostic, Fault Diagnosis | Unverified
Hierarchical Reinforcement Learning and Value Optimization for Challenging Quadruped Locomotion | Jun 24, 2025 | Hierarchical Reinforcement Learning, reinforcement-learning | Unverified
Partially Observable Residual Reinforcement Learning for PV-Inverter-Based Voltage Control in Distribution Grids | Jun 24, 2025 | reinforcement-learning, Reinforcement Learning | Code Available
Robots and Children that Learn Together : Improving Knowledge Retention by Teaching Peer-Like Interactive Robots | Jun 23, 2025 | Memorization, Reinforcement Learning (RL) | Unverified
AdapThink: Adaptive Thinking Preferences for Reasoning Language Model | Jun 23, 2025 | Diversity, Language Modeling | Unverified
Accelerating Residual Reinforcement Learning with Uncertainty Estimation | Jun 21, 2025 | D4RL, reinforcement-learning | Unverified
Leveling the Playing Field: Carefully Comparing Classical and Learned Controllers for Quadrotor Trajectory Tracking | Jun 21, 2025 | Benchmarking, Reinforcement Learning (RL) | Unverified
Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training via Symmetric Policy Evaluation | Jun 20, 2025 | Reinforcement Learning (RL) | Code Available
Learning Dexterous Object Handover | Jun 20, 2025 | Object, Reinforcement Learning (RL) | Unverified
Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity | Jun 20, 2025 | continuous-control, Continuous Control | Code Available
VRAIL: Vectorized Reward-based Attribution for Interpretable Learning | Jun 19, 2025 | Reinforcement Learning (RL) | Unverified
Dual-Objective Reinforcement Learning with Novel Hamilton-Jacobi-Bellman Formulations | Jun 19, 2025 | Reinforcement Learning (RL) | Unverified
Multi-Task Lifelong Reinforcement Learning for Wireless Sensor Networks | Jun 19, 2025 | reinforcement-learning, Reinforcement Learning | Unverified
From General to Targeted Rewards: Surpassing GPT-4 in Open-Ended Long-Context Generation | Jun 19, 2025 | Dataset Generation, Reinforcement Learning (RL) | Unverified
Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study | Jun 18, 2025 | Earth Observation, Management | Unverified
Make Your AUV Adaptive: An Environment-Aware Reinforcement Learning Framework For Underwater Tasks | Jun 18, 2025 | Decision Making, Language Modeling | Unverified
Steering Your Diffusion Policy with Latent Space Reinforcement Learning | Jun 18, 2025 | reinforcement-learning, Reinforcement Learning | Unverified
Reinforcement Learning-Based Policy Optimisation For Heterogeneous Radio Access | Jun 18, 2025 | Q-Learning, reinforcement-learning | Unverified
PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning | Jun 17, 2025 | General Reinforcement Learning, Multimodal Reasoning | Unverified
Adaptive Reinforcement Learning for Unobservable Random Delays | Jun 17, 2025 | reinforcement-learning, Reinforcement Learning | Unverified
Reasoning with Exploration: An Entropy Perspective | Jun 17, 2025 | Reinforcement Learning (RL) | Unverified
Zeroth-Order Optimization is Secretly Single-Step Policy Optimization | Jun 17, 2025 | Reinforcement Learning (RL) | Unverified
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs | Jun 17, 2025 | Data Integration, Large Language Model | Unverified
IntelliLung: Advancing Safe Mechanical Ventilation using Offline RL with Hybrid Actions and Clinically Aligned Rewards | Jun 17, 2025 | Offline RL, Reinforcement Learning (RL) | Unverified
HiLight: A Hierarchical Reinforcement Learning Framework with Global Adversarial Guidance for Large-Scale Traffic Signal Control | Jun 17, 2025 | Hierarchical Reinforcement Learning, reinforcement-learning | Unverified
Unsupervised Skill Discovery through Skill Regions Differentiation | Jun 17, 2025 | Density Estimation, Reinforcement Learning (RL) | Unverified
Socratic RL: A Novel Framework for Efficient Knowledge Acquisition through Iterative Reflection and Viewpoint Distillation | Jun 16, 2025 | Meta-Learning, reinforcement-learning | Unverified
ReinDSplit: Reinforced Dynamic Split Learning for Pest Recognition in Precision Agriculture | Jun 16, 2025 | Q-Learning, Reinforcement Learning (RL) | Unverified
Overcoming Overfitting in Reinforcement Learning via Gaussian Process Diffusion Policy | Jun 16, 2025 | GPR, Reinforcement Learning (RL) | Code Available
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy | Jun 16, 2025 | Math, Reinforcement Learning (RL) | Unverified
A Technical Study into Small Reasoning Language Models | Jun 16, 2025 | Code Generation, Computational Efficiency | Unverified
StaQ it! Growing neural networks for Policy Mirror Descent | Jun 16, 2025 | Reinforcement Learning (RL) | Unverified
The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning | Jun 16, 2025 | Deep Reinforcement Learning, MuJoCo | Unverified
Value-Free Policy Optimization via Reward Partitioning | Jun 16, 2025 | Language Modeling, Language Modelling | Code Available
Can you see how I learn? Human observers' inferences about Reinforcement Learning agents' learning processes | Jun 16, 2025 | Reinforcement Learning (RL) | Unverified
Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning | Jun 16, 2025 | Reinforcement Learning (RL) | Unverified
RL-Guided MPC for Autonomous Greenhouse Control | Jun 16, 2025 | Model Predictive Control, Reinforcement Learning (RL) | Unverified
Federated Neuroevolution O-RAN: Enhancing the Robustness of Deep Reinforcement Learning xApps | Jun 15, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Unverified
CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making | Jun 15, 2025 | Answer Generation, Decision Making | Unverified
DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Jun 14, 2025 | continuous-control, Continuous Control | Code Available
MM-R5: MultiModal Reasoning-Enhanced ReRanker via Reinforcement Learning for Document Retrieval | Jun 14, 2025 | Instruction Following, Multimodal Reasoning | Code Available
ReVeal: Self-Evolving Code Agents via Iterative Generation-Verification | Jun 13, 2025 | Code Generation, reinforcement-learning | Unverified
Automated Treatment Planning for Interstitial HDR Brachytherapy for Locally Advanced Cervical Cancer using Deep Reinforcement Learning | Jun 13, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Unverified
Eliciting Reasoning in Language Models with Cognitive Tools | Jun 13, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | Unverified