| WebVLN: Vision-and-Language Navigation on Websites | Dec 25, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 | 5 |
| Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments | Jul 31, 2024 | graph construction, Navigate | Code Available | 1 | 5 |
| Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation | Mar 5, 2022 | Imitation Learning, Vision and Language Navigation | Code Available | 1 | 5 |
| Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments | Nov 29, 2018 | Position, Spatial Reasoning | Code Available | 1 | 5 |
| Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision | Dec 1, 2021 | cross-modal alignment, Navigate | Code Available | 1 | 5 |
| Diagnosing the Environment Bias in Vision-and-Language Navigation | May 6, 2020 | Vision and Language Navigation | Code Available | 1 | 5 |
| Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation | Nov 10, 2021 | Decoder, Navigate | Code Available | 1 | 5 |
| A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation | May 5, 2023 | Vision and Language Navigation | Code Available | 1 | 5 |
| CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | May 8, 2025 | Large Language Model, Navigate | Code Available | 1 | 5 |
| Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas | Mar 25, 2022 | Diversity, Vision and Language Navigation | Code Available | 1 | 5 |
| GridMM: Grid Memory Map for Vision-and-Language Navigation | Jul 24, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 | 5 |
| Neighbor-view Enhanced Model for Vision and Language Navigation | Jul 15, 2021 | Navigate, Vision and Language Navigation | Code Available | 1 | 5 |
| A Recurrent Vision-and-Language BERT for Navigation | Nov 26, 2020 | Decision Making, Decoder | Code Available | 1 | 5 |
| MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation for Effective-and-Efficient Vision-and-Language Navigation | Jun 25, 2024 | Knowledge Distillation, Test unseen | Code Available | 1 | 5 |
| Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation | Jul 23, 2021 | Vision and Language Navigation, Vision-Language Navigation | Code Available | 1 | 5 |
| Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments | Nov 20, 2017 | Reinforcement Learning, Translation | Code Available | 1 | 5 |
| VLN-Trans: Translator for the Vision and Language Navigation Agent | Feb 18, 2023 | Vision and Language Navigation | Code Available | 1 | 5 |
| VALAN: Vision and Language Agent Navigation | Dec 6, 2019 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1 | 5 |
| Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding | Oct 15, 2020 | Vision and Language Navigation | Code Available | 1 | 5 |
| Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training | Feb 25, 2020 | Navigate, Self-Supervised Learning | Code Available | 1 | 5 |
| g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks | Nov 26, 2024 | Contrastive Learning, Question Answering | Code Available | 1 | 5 |
| VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View | Jul 12, 2023 | Decision Making, Natural Language Understanding | Code Available | 1 | 5 |
| Learning Navigational Visual Representations with Semantic Map Supervision | Jul 23, 2023 | Representation Learning, Self-Supervised Learning | Code Available | 1 | 5 |
| March in Chat: Interactive Prompting for Remote Embodied Referring Expression | Aug 20, 2023 | Referring Expression, Vision and Language Navigation | Code Available | 1 | 5 |
| FedVLN: Privacy-preserving Federated Vision-and-Language Navigation | Mar 28, 2022 | Privacy Preserving, Vision and Language Navigation | Code Available | 1 | 5 |
| Learning Vision-and-Language Navigation from YouTube Videos | Jul 22, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 | 5 |
| Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation | Aug 24, 2023 | cross-modal alignment, Descriptive | Code Available | 1 | 5 |
| Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | Jul 1, 2020 | Style Transfer, Text Style Transfer | Code Available | 1 | 5 |
| Language and Visual Entity Relationship Graph for Agent Navigation | Oct 19, 2020 | Dynamic Time Warping, Navigate | Code Available | 1 | 5 |
| Learning from Unlabeled 3D Environments for Vision-and-Language Navigation | Aug 24, 2022 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation | Oct 14, 2022 | Navigate, Vision and Language Navigation | Code Available | 1 | 5 |
| Explicit Object Relation Alignment for Vision and Language Navigation | May 1, 2022 | Object, Relation | Code Available | 0 | 5 |
| The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation | Mar 5, 2019 | Decision Making, Vision and Language Navigation | Code Available | 0 | 5 |
| The Regretful Navigation Agent for Vision-and-Language Navigation | Mar 5, 2019 | Decision Making, Vision and Language Navigation | Code Available | 0 | 5 |
| Chasing Ghosts: Instruction Following as Bayesian State Tracking | Jul 3, 2019 | Instruction Following, Vision and Language Navigation | Code Available | 0 | 5 |
| Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation | Mar 6, 2019 | Vision and Language Navigation, Vision-Language Navigation | Code Available | 0 | 5 |
| Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters | Jul 5, 2019 | Vision and Language Navigation | Code Available | 0 | 5 |
| Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation | Sep 9, 2024 | Vision and Language Navigation | Code Available | 0 | 5 |
| Speaker-Follower Models for Vision-and-Language Navigation | Jun 7, 2018 | Data Augmentation, Vision and Language Navigation | Code Available | 0 | 5 |
| Kefa: A Knowledge Enhanced and Fine-grained Aligned Speaker for Navigation Instruction Generation | Jul 25, 2023 | Vision and Language Navigation | Code Available | 0 | 5 |
| Diagnosing Vision-and-Language Navigation: What Really Matters | Mar 30, 2021 | Diagnostic, Object | Code Available | 0 | 5 |
| Into the Unknown: Generating Geospatial Descriptions for New Environments | Jun 28, 2024 | Language Modelling, Large Language Model | Code Available | 0 | 5 |
| REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments | Apr 23, 2019 | Referring Expression, Vision and Language Navigation | Code Available | 0 | 5 |
| Behavioral Analysis of Vision-and-Language Navigation Agents | Jul 20, 2023 | Vision and Language Navigation | Code Available | 0 | 5 |
| DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning | Apr 2, 2024 | Contrastive Learning, Decision Making | Code Available | 0 | 5 |
| Multimodal Attention Networks for Low-Level Vision-and-Language Navigation | Nov 27, 2019 | Vision and Language Navigation | Code Available | 0 | 5 |
| Augmented Commonsense Knowledge for Remote Object Grounding | Jun 3, 2024 | Decision Making, Object | Code Available | 0 | 5 |
| Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Mar 18, 2024 | Common Sense Reasoning, Efficient Exploration | Code Available | 0 | 5 |
| NavHint: Vision and Language Navigation Agent with a Hint Generator | Feb 4, 2024 | Vision and Language Navigation | Code Available | 0 | 5 |
| Ground then Navigate: Language-guided Navigation in Dynamic Scenes | Sep 24, 2022 | Autonomous Driving, Navigate | Code Available | 0 | 5 |