| WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | Feb 8, 2024 | Conversational Web Navigation, Text Generation | Code Available | 5 |
| NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models | Jul 17, 2024 | Instruction Following, Vision and Language Navigation | Code Available | 3 |
| Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models | Jul 9, 2024 | Vision and Language Navigation | Code Available | 3 |
| NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM | Feb 16, 2025 | Navigate, RAG | Code Available | 2 |
| NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning | Mar 12, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| FlightGPT: Towards Generalizable and Interpretable UAV Vision-and-Language Navigation with Vision-Language Models | May 19, 2025 | Disaster Response, Vision and Language Navigation | Code Available | 2 |
| BEVBert: Multimodal Map Pre-training for Language-guided Navigation | Dec 8, 2022 | Vision and Language Navigation, Visual Navigation | Code Available | 2 |
| Vision-and-Language Navigation via Causal Learning | Apr 16, 2024 | Causal Inference, Contrastive Learning | Code Available | 2 |
| 1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022) | Jun 23, 2022 | Data Augmentation, Vision and Language Navigation | Code Available | 2 |
| Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions | Mar 22, 2022 | Vision and Language Navigation | Code Available | 2 |
| FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Aug 20, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information | Jun 20, 2024 | Vision and Language Navigation | Code Available | 2 |
| NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models | May 26, 2023 | Instruction Following, Vision and Language Navigation | Code Available | 2 |
| General Scene Adaptation for Vision-and-Language Navigation | Jan 29, 2025 | Diversity, Vision and Language Navigation | Code Available | 2 |
| Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation | May 16, 2025 | 3D geometry, Navigate | Code Available | 2 |
| Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions | Jun 27, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation | Jun 14, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation | Feb 23, 2022 | Efficient Exploration, Navigate | Code Available | 2 |
| NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments | Jun 30, 2025 | Decision Making, Vision and Language Navigation | Code Available | 2 |
| AerialVLN: Vision-and-Language Navigation for UAVs | Aug 13, 2023 | cross-modal alignment, Navigate | Code Available | 2 |
| Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation | Apr 2, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| Scaling Data Generation in Vision-and-Language Navigation | Jul 28, 2023 | Imitation Learning, Vision and Language Navigation | Code Available | 2 |
| Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas | Mar 25, 2022 | Diversity, Vision and Language Navigation | Code Available | 1 |
| Airbert: In-domain Pretraining for Vision-and-Language Navigation | Aug 20, 2021 | Navigate, Referring Expression | Code Available | 1 |
| Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation | Nov 10, 2021 | Decoder, Navigate | Code Available | 1 |
| Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation | Mar 5, 2022 | Imitation Learning, Vision and Language Navigation | Code Available | 1 |
| Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments | Jul 31, 2024 | graph construction, Navigate | Code Available | 1 |
| A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation | May 5, 2023 | Vision and Language Navigation | Code Available | 1 |
| CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | May 8, 2025 | Large Language Model, Navigate | Code Available | 1 |
| March in Chat: Interactive Prompting for Remote Embodied Referring Expression | Aug 20, 2023 | Referring Expression, Vision and Language Navigation | Code Available | 1 |
| Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments | Apr 6, 2020 | Vision and Language Navigation | Code Available | 1 |
| Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-Language Navigation | Dec 9, 2024 | Object Localization, Vision and Language Navigation | Code Available | 1 |
| ESceme: Vision-and-Language Navigation with Episodic Scene Memory | Mar 2, 2023 | Vision and Language Navigation | Code Available | 1 |
| MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation for Effective-and-Efficient Vision-and-Language Navigation | Jun 25, 2024 | Knowledge Distillation, Test unseen | Code Available | 1 |
| Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | Jul 1, 2020 | Style Transfer, Text Style Transfer | Code Available | 1 |
| Language and Visual Entity Relationship Graph for Agent Navigation | Oct 19, 2020 | Dynamic Time Warping, Navigate | Code Available | 1 |
| Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision | Dec 1, 2021 | cross-modal alignment, Navigate | Code Available | 1 |
| Learning from Unlabeled 3D Environments for Vision-and-Language Navigation | Aug 24, 2022 | Language Modeling, Language Modelling | Code Available | 1 |
| Improving Vision-and-Language Navigation with Image-Text Pairs from the Web | Apr 30, 2020 | Vision and Language Navigation | Code Available | 1 |
| Cross-modal Map Learning for Vision and Language Navigation | Mar 10, 2022 | Vision and Language Navigation | Code Available | 1 |
| KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation | Mar 28, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation | May 27, 2025 | Large Language Model, Logical Reasoning | Code Available | 1 |
| EnvEdit: Environment Editing for Vision-and-Language Navigation | Mar 29, 2022 | Data Augmentation, Diversity | Code Available | 1 |
| DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents | Oct 22, 2022 | Autonomous Driving, Dialogue Act Classification | Code Available | 1 |
| Diagnosing the Environment Bias in Vision-and-Language Navigation | May 6, 2020 | Vision and Language Navigation | Code Available | 1 |
| BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps | May 10, 2020 | Imitation Learning, Navigate | Code Available | 1 |
| The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation | Apr 9, 2021 | Vision and Language Navigation, Vision-Language Navigation | Code Available | 1 |
| Learning Navigational Visual Representations with Semantic Map Supervision | Jul 23, 2023 | Representation Learning, Self-Supervised Learning | Code Available | 1 |
| History Aware Multimodal Transformer for Vision-and-Language Navigation | Oct 25, 2021 | Decision Making, Navigate | Code Available | 1 |
| A Recurrent Vision-and-Language BERT for Navigation | Nov 26, 2020 | Decision Making, Decoder | Code Available | 1 |