| Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities | Jul 17, 2025 | Large Language Model, Vision and Language Navigation | Unverified | 0 |
| NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments | Jun 30, 2025 | Decision Making, Vision and Language Navigation | Code Available | 2 |
| Grounded Vision-Language Navigation for UAVs with Open-Vocabulary Goal Understanding | Jun 12, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| A Navigation Framework Utilizing Vision-Language Models | Jun 11, 2025 | Navigate, Prompt Engineering | Code Available | 0 |
| Disrupting Vision-Language Model-Driven Navigation Services via Adversarial Object Fusion | May 29, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation | May 27, 2025 | Large Language Model, Logical Reasoning | Code Available | 1 |
| FlightGPT: Towards Generalizable and Interpretable UAV Vision-and-Language Navigation with Vision-Language Models | May 19, 2025 | Disaster Response, Vision and Language Navigation | Code Available | 2 |
| Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation | May 16, 2025 | 3D geometry, Navigate | Code Available | 2 |
| CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | May 8, 2025 | Large Language Model, Navigate | Code Available | 1 |
| MetaScenes: Towards Automated Replica Creation for Real-world 3D Scans | May 5, 2025 | Vision and Language Navigation | Unverified | 0 |
| DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation | Apr 30, 2025 | Navigate, Object | Unverified | 0 |
| ST-Booster: An Iterative SpatioTemporal Perception Booster for Vision-and-Language Navigation in Continuous Environments | Apr 14, 2025 | Navigate, Vision and Language Navigation | Unverified | 0 |
| Endowing Embodied Agents with Spatial Reasoning Capabilities for Vision-and-Language Navigation | Apr 9, 2025 | Hallucination, Spatial Reasoning | Unverified | 0 |
| COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation | Mar 31, 2025 | Memorization, Vision and Language Navigation | Unverified | 0 |
| Do Visual Imaginations Improve Vision-and-Language Navigation Agents? | Mar 20, 2025 | Vision and Language Navigation | Unverified | 0 |
| FlexVLN: Flexible Adaptation for Diverse Vision-and-Language Navigation Tasks | Mar 18, 2025 | Vision and Language Navigation | Unverified | 0 |
| HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard | Mar 18, 2025 | Benchmarking, Human Dynamics | Unverified | 0 |
| Aerial Vision-and-Language Navigation with Grid-based View Selection and Map Construction | Mar 14, 2025 | Navigate, Vision and Language Navigation | Unverified | 0 |
| Observation-Graph Interaction and Key-Detail Guidance for Vision and Language Navigation | Mar 14, 2025 | cross-modal alignment, Navigate | Unverified | 0 |
| PanoGen++: Domain-Adapted Text-Guided Panoramic Environment Generation for Vision-and-Language Navigation | Mar 13, 2025 | Image Inpainting, Image Outpainting | Unverified | 0 |
| SmartWay: Enhanced Waypoint Prediction and Backtracking for Zero-Shot Vision-and-Language Navigation | Mar 13, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Ground-level Viewpoint Vision-and-Language Navigation in Continuous Environments | Feb 26, 2025 | Instruction Following, Vision and Language Navigation | Unverified | 0 |
| NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM | Feb 16, 2025 | Navigate, RAG | Code Available | 2 |
| TRAVEL: Training-Free Retrieval and Alignment for Vision-and-Language Navigation | Feb 11, 2025 | Retrieval, Vision and Language Navigation | Unverified | 0 |
| General Scene Adaptation for Vision-and-Language Navigation | Jan 29, 2025 | Diversity, Vision and Language Navigation | Code Available | 2 |
| Language and Planning in Robotic Navigation: A Multilingual Evaluation of State-of-the-Art Models | Jan 7, 2025 | Instruction Following, Vision and Language Navigation | Unverified | 0 |
| NAVCON: A Cognitively Inspired and Linguistically Grounded Corpus for Vision and Language Navigation | Dec 17, 2024 | Few-Shot Learning, Vision and Language Navigation | Unverified | 0 |
| RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation | Dec 11, 2024 | 3D Reconstruction, Diversity | Unverified | 0 |
| Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-Language Navigation | Dec 9, 2024 | Object Localization, Vision and Language Navigation | Code Available | 1 |
| World-Consistent Data Generation for Vision-and-Language Navigation | Dec 9, 2024 | Data Augmentation, Navigate | Unverified | 0 |
| NaVILA: Legged Robot Vision-Language-Action Model for Navigation | Dec 5, 2024 | Navigate, Vision and Language Navigation | Unverified | 0 |
| Hijacking Vision-and-Language Navigation Agents with Adversarial Environmental Attacks | Dec 3, 2024 | Adversarial Attack, Vision and Language Navigation | Unverified | 0 |
| Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation | Nov 30, 2024 | Navigate, Vision and Language Navigation | Unverified | 0 |
| g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks | Nov 26, 2024 | Contrastive Learning, Question Answering | Code Available | 1 |
| UnitedVLN: Generalizable Gaussian Splatting for Continuous Vision-Language Navigation | Nov 25, 2024 | 3DGS, Navigate | Unverified | 0 |
| Fine-Grained Alignment in Vision-and-Language Navigation through Bayesian Optimization | Nov 22, 2024 | Bayesian Optimization, Contrastive Learning | Unverified | 0 |
| NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation | Nov 13, 2024 | Navigate, Vision and Language Navigation | Unverified | 0 |
| Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Zero-Shot Vision-and-Language Navigation with Collision Mitigation in Continuous Environment | Oct 7, 2024 | Large Language Model, Vision and Language Navigation | Unverified | 0 |
| MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation | Sep 27, 2024 | Knowledge Distillation, Vision and Language Navigation | Unverified | 0 |
| Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs | Sep 27, 2024 | Decision Making, Navigate | Unverified | 0 |
| Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation | Sep 9, 2024 | Vision and Language Navigation | Code Available | 0 |
| Seeing is Believing? Enhancing Vision-Language Navigation using Visual Perturbations | Sep 9, 2024 | Autonomous Navigation, Diversity | Unverified | 0 |
| FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Aug 20, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| Narrowing the Gap between Vision and Action in Navigation | Aug 19, 2024 | Decoder, Spatial Reasoning | Code Available | 0 |
| Loc4Plan: Locating Before Planning for Outdoor Vision and Language Navigation | Aug 9, 2024 | Navigate, Position | Unverified | 0 |
| Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments | Jul 31, 2024 | graph construction, Navigate | Code Available | 1 |
| NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models | Jul 17, 2024 | Instruction Following, Vision and Language Navigation | Code Available | 3 |
| PRET: Planning with Directed Fidelity Trajectory for Vision and Language Navigation | Jul 16, 2024 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models | Jul 9, 2024 | Vision and Language Navigation | Code Available | 3 |