One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms | Jul 21, 2025 | Multi-agent Reinforcement Learning | Code Available | 0
A Learning Framework For Cooperative Collision Avoidance of UAV Swarms Leveraging Domain Knowledge | Jul 15, 2025 | Collision Avoidance, Multi-agent Reinforcement Learning | Unverified | 0
Artificial Generals Intelligence: Mastering Generals.io with Reinforcement Learning | Jul 9, 2025 | GPU, Multi-agent Reinforcement Learning | Unverified | 0
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning | Jun 30, 2025 | Math, Multi-agent Reinforcement Learning | Code Available | 2
The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind | Jun 25, 2025 | Multi-agent Reinforcement Learning, Navigate | Code Available | 1
Center of Gravity-Guided Focusing Influence Mechanism for Multi-Agent Reinforcement Learning | Jun 24, 2025 | counterfactual, Multi-agent Reinforcement Learning | Unverified | 0
Learning Bilateral Team Formation in Cooperative Multi-Agent Reinforcement Learning | Jun 24, 2025 | Multi-agent Reinforcement Learning | Unverified | 0
Transformer World Model for Sample Efficient Multi-Agent Reinforcement Learning | Jun 23, 2025 | Multi-agent Reinforcement Learning, Starcraft | Code Available | 0
Generalizable Agent Modeling for Agent Collaboration-Competition Adaptation with Multi-Retrieval and Dynamic Generation | Jun 20, 2025 | Multi-agent Reinforcement Learning, SMAC | Code Available | 0
Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study | Jun 18, 2025 | Earth Observation, Management | Unverified | 0
Light Aircraft Game : Basic Implementation and training results analysis | Jun 17, 2025 | Multi-agent Reinforcement Learning | Code Available | 0
Dynamic Reinsurance Treaty Bidding via Multi-Agent Reinforcement Learning | Jun 16, 2025 | Multi-agent Reinforcement Learning, reinforcement-learning | Unverified | 0
MARCO: Hardware-Aware Neural Architecture Search for Edge Devices with Multi-Agent Reinforcement Learning and Conformal Prediction Filtering | Jun 16, 2025 | Conformal Prediction, Hardware Aware Neural Architecture Search | Unverified | 0
Homeostatic Coupling for Prosocial Behavior | Jun 15, 2025 | Multi-agent Reinforcement Learning | Unverified | 0
Wasserstein-Barycenter Consensus for Cooperative Multi-Agent Reinforcement Learning | Jun 14, 2025 | Multi-agent Reinforcement Learning, reinforcement-learning | Unverified | 0
Trust-MARL: Trust-Based Multi-Agent Reinforcement Learning Framework for Cooperative On-Ramp Merging Control in Heterogeneous Traffic Flow | Jun 14, 2025 | Multi-agent Reinforcement Learning | Unverified | 0
Multi-Agent Language Models: Advancing Cooperation, Coordination, and Adaptation | Jun 11, 2025 | Multi-agent Reinforcement Learning | Unverified | 0
When Is Diversity Rewarded in Cooperative Multi-Agent Learning? | Jun 11, 2025 | Diversity, Multi-agent Reinforcement Learning | Unverified | 0
Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models | Jun 9, 2025 | Multi-agent Reinforcement Learning, Safety Alignment | Code Available | 1
Curriculum Learning With Counterfactual Group Relative Policy Advantage For Multi-Agent Reinforcement Learning | Jun 9, 2025 | counterfactual, Multi-agent Reinforcement Learning | Code Available | 1
Decentralizing Multi-Agent Reinforcement Learning with Temporal Causal Information | Jun 9, 2025 | Multi-agent Reinforcement Learning, reinforcement-learning | Unverified | 0
Ego-centric Learning of Communicative World Models for Autonomous Driving | Jun 9, 2025 | Autonomous Driving, Multi-agent Reinforcement Learning | Unverified | 0
Learn as Individuals, Evolve as a Team: Multi-agent LLMs Adaptation in Embodied Environments | Jun 8, 2025 | Multi-agent Reinforcement Learning | Unverified | 0
Policy Optimization for Continuous-time Linear-Quadratic Graphon Mean Field Games | Jun 6, 2025 | Bilevel Optimization, Multi-agent Reinforcement Learning | Unverified | 0
A MARL-based Approach for Easing MAS Organization Engineering | Jun 5, 2025 | Multi-agent Reinforcement Learning | Unverified | 0