| World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning | Mar 13, 2025 | Task Planning | Unverified | 0 |
| SurgicalVLM-Agent: Towards an Interactive AI Co-Pilot for Pituitary Surgery | Mar 12, 2025 | Activity Recognition, Anatomy | Unverified | 0 |
| General-Purpose Aerial Intelligent Agents Empowered by Large Language Models | Mar 11, 2025 | Motion Planning, Scene Understanding | Unverified | 0 |
| Investigating the Effectiveness of a Socratic Chain-of-Thoughts Reasoning Method for Task Planning in Robotics, A Case Study | Mar 11, 2025 | Code Generation, Task Planning | Unverified | 0 |
| Self-Corrective Task Planning by Inverse Prompting with Large Language Models | Mar 10, 2025 | Robot Task Planning, Task Planning | Unverified | 0 |
| Graphormer-Guided Task Planning: Beyond Static Rules with LLM Safety Perception | Mar 10, 2025 | Task Planning | Code Available | 0 |
| STAR: A Foundation Model-driven Framework for Robust Task Planning and Failure Recovery in Robotic Systems | Mar 8, 2025 | Information Retrieval, Knowledge Graphs | Unverified | 0 |
| Safe LLM-Controlled Robots with Formal Guarantees via Reachability Analysis | Mar 5, 2025 | Autonomous Navigation, Navigate | Code Available | 0 |
| Improving Retrospective Language Agents via Joint Policy Gradient Optimization | Mar 3, 2025 | Decision Making, Imitation Learning | Unverified | 0 |
| CLEA: Closed-Loop Embodied Agent for Enhancing Task Execution in Dynamic Environments | Mar 2, 2025 | Task Planning | Code Available | 0 |
| Structured Preference Optimization for Vision-Language Long-Horizon Task Planning | Feb 28, 2025 | Task Planning, Visual Grounding | Unverified | 0 |
| RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete | Feb 28, 2025 | Task Planning, Trajectory Prediction | Unverified | 0 |
| MRBTP: Efficient Multi-Robot Behavior Tree Planning and Collaboration | Feb 25, 2025 | Robot Task Planning, Task Planning | Code Available | 1 |
| RapidPen: Fully Automated IP-to-Shell Penetration Testing with LLM-based Agents | Feb 23, 2025 | Task Planning | Unverified | 0 |
| Plan-over-Graph: Towards Parallelable LLM Agent Schedule | Feb 20, 2025 | Task Planning | Code Available | 1 |
| Towards Robust and Secure Embodied AI: A Survey on Vulnerabilities and Attacks | Feb 18, 2025 | Adversarial Attack, Autonomous Vehicles | Unverified | 0 |
| Scaling Autonomous Agents via Automatic Reward Modeling And Planning | Feb 17, 2025 | Decision Making, Mathematical Problem-Solving | Unverified | 0 |
| NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM | Feb 16, 2025 | Navigate, RAG | Code Available | 2 |
| OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning | Feb 16, 2025 | MedQA, MMLU | Unverified | 0 |
| D-CIPHER: Dynamic Collaborative Intelligent Multi-Agent System with Planner and Heterogeneous Executors for Offensive Security | Feb 15, 2025 | Task Planning | Code Available | 2 |
| STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning | Feb 14, 2025 | Decision Making, Spatial Reasoning | Unverified | 0 |
| Vote-Tree-Planner: Optimizing Execution Order in LLM-based Task Planning Pipeline via Voting | Feb 13, 2025 | Decision Making, Task Planning | Unverified | 0 |
| 3D-Grounded Vision-Language Framework for Robotic Task Planning: Automated Prompt Synthesis and Supervised Reasoning | Feb 13, 2025 | Code Generation, Scene Understanding | Unverified | 0 |
| Robotouille: An Asynchronous Planning Benchmark for LLM Agents | Feb 6, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Large Language Models for Multi-Robot Systems: A Survey | Feb 6, 2025 | Action Generation, Benchmarking | Code Available | 1 |
| A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) | Feb 5, 2025 | Hallucination, Spatial Reasoning | Unverified | 0 |
| 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | Jan 28, 2025 | Instruction Following, Mixture-of-Experts | Unverified | 0 |
| PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding | Jan 27, 2025 | Benchmarking, Common Sense Reasoning | Unverified | 0 |
| Zero-shot Robotic Manipulation with Language-guided Instruction and Formal Task Planning | Jan 25, 2025 | Motion Planning, Task and Motion Planning | Unverified | 0 |
| Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation | Jan 21, 2025 | Task Planning | Unverified | 0 |
| SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning | Jan 17, 2025 | Spatial Reasoning, Task Planning | Unverified | 0 |
| VLM-driven Behavior Tree for Context-aware Task Planning | Jan 7, 2025 | Task Planning | Code Available | 1 |
| Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model | Dec 30, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning | Dec 27, 2024 | counterfactual, Hierarchical Reinforcement Learning | Unverified | 0 |
| A Paragraph is All It Takes: Rich Robot Behaviors from Interacting, Trusted LLMs | Dec 24, 2024 | All, Task Planning | Unverified | 0 |
| Multi-Modal Grounded Planning and Efficient Replanning For Learning Embodied Agents with A Few Examples | Dec 23, 2024 | Common Sense Reasoning, Task Planning | Code Available | 1 |
| GraphAgent: Agentic Graph Language Assistant | Dec 22, 2024 | Knowledge Graphs, Node Classification | Code Available | 0 |
| Tree-of-Code: A Hybrid Approach for Robust Complex Task Planning and Execution | Dec 18, 2024 | Code Generation, Task Planning | Unverified | 0 |
| From An LLM Swarm To A PDDL-Empowered HIVE: Planning Self-Executed Instructions In A Multi-Modal Jungle | Dec 17, 2024 | AI Agent, Formal Logic | Unverified | 0 |
| SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents | Dec 17, 2024 | Task Planning | Code Available | 2 |
| Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning | Dec 16, 2024 | Hallucination, Robot Manipulation | Code Available | 2 |
| Ontology-driven Prompt Tuning for LLM-based Task and Motion Planning | Dec 10, 2024 | Motion Planning, Task and Motion Planning | Unverified | 0 |
| HyperGraphOS: A Meta Operating System for Science and Engineering | Dec 6, 2024 | Code Generation, Management | Unverified | 0 |
| DataLab: A Unified Platform for LLM-Powered Business Intelligence | Dec 3, 2024 | Large Language Model, Task Planning | Unverified | 0 |
| RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World | Nov 29, 2024 | Robot Task Planning, Scheduling | Code Available | 2 |
| One-Shot Real-to-Sim via End-to-End Differentiable Simulation and Rendering | Nov 29, 2024 | Benchmarking, Object | Unverified | 0 |
| Time is on my sight: scene graph filtering for dynamic environment perception in an LLM-driven robot | Nov 22, 2024 | Object Localization, Task Planning | Unverified | 0 |
| Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation | Nov 18, 2024 | Knowledge Graphs, Robot Manipulation | Code Available | 0 |
| VeriGraph: Scene Graphs for Execution Verifiable Robot Planning | Nov 15, 2024 | Robot Task Planning, Task Planning | Unverified | 0 |
| WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models | Nov 8, 2024 | Task Planning, Zero-shot Generalization | Code Available | 2 |