| GTA1: GUI Test-time Scaling Agent | Jul 8, 2025 | Reinforcement Learning (RL), Task Planning | Code Available | 2 |
| MedPrompt: LLM-CNN Fusion with Weight Routing for Medical Image Segmentation and Classification | Jun 26, 2025 | Image Segmentation, Large Language Model | Unverified | 0 |
| VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models | Jun 21, 2025 | Action Generation, Continual Learning | Unverified | 0 |
| Towards AI Search Paradigm | Jun 20, 2025 | Decision Making, Retrieval-augmented Generation | Unverified | 0 |
| Multimodal Fused Learning for Solving the Generalized Traveling Salesman Problem in Robotic Task Planning | Jun 20, 2025 | Computational Efficiency, Task Planning | Unverified | 0 |
| A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications | Jun 14, 2025 | Information Retrieval, Survey | Code Available | 3 |
| Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills | Jun 12, 2025 | Large Language Model, Task Planning | Unverified | 0 |
| VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning | Jun 10, 2025 | Task Planning, Visual Reasoning | Unverified | 0 |
| Language-Vision Planner and Executor for Text-to-Visual Reasoning | Jun 9, 2025 | In-Context Learning, MME | Unverified | 0 |
| Prime the search: Using large language models for guiding geometric task and motion planning by warm-starting tree search | Jun 8, 2025 | Common Sense Reasoning, Motion Planning | Code Available | 0 |
| RoboPARA: Dual-Arm Robot Planning with Parallel Allocation and Recomposition Across Tasks | Jun 7, 2025 | Large Language Model, Task Planning | Unverified | 0 |
| Hierarchical Debate-Based Large Language Model (LLM) for Complex Task Planning of 6G Network Management | Jun 6, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Understanding Physical Properties of Unseen Deformable Objects by Leveraging Large Language Models and Robot Actions | Jun 4, 2025 | Motion Planning, Task and Motion Planning | Unverified | 0 |
| ChemGraph: An Agentic Framework for Computational Chemistry Workflows | Jun 3, 2025 | Computational chemistry, Graph Neural Network | Unverified | 0 |
| FlySearch: Exploring how vision-language models explore | Jun 3, 2025 | Hallucination, Task Planning | Code Available | 1 |
| Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs | Jun 3, 2025 | Object, Object Rearrangement | Unverified | 0 |
| Grounded Vision-Language Interpreter for Integrated Task and Motion Planning | Jun 3, 2025 | Motion Planning, Task and Motion Planning | Unverified | 0 |
| LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks | May 31, 2025 | Task Planning, Vision-Language-Action | Unverified | 0 |
| MASTER: Multi-Agent Security Through Exploration of Roles and Topological Structures -- A Comprehensive Framework | May 24, 2025 | Task Planning | Unverified | 0 |
| BEDI: A Comprehensive Benchmark for Evaluating Embodied Agents on UAVs | May 23, 2025 | Model Optimization, Task Planning | Code Available | 1 |
| CRAKEN: Cybersecurity LLM Agent with Knowledge-Based Execution | May 21, 2025 | Large Language Model, Task Planning | Code Available | 1 |
| Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets | May 21, 2025 | Dataset Generation, Descriptive | Unverified | 0 |
| Building a Stable Planner: An Extended Finite State Machine Based Planning Module for Mobile GUI Agent | May 20, 2025 | Task Planning | Unverified | 0 |
| APEX: Empowering LLMs with Physics-Based Task Planning for Real-time Insight | May 20, 2025 | Causal Inference, Decision Making | Code Available | 0 |
| REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning? | May 16, 2025 | Large Language Model, Robot Task Planning | Unverified | 0 |