| DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents | Oct 22, 2022 | Autonomous Driving, Dialogue Act Classification | Code Available | 1 |
| Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments | Jul 31, 2024 | graph construction, Navigate | Code Available | 1 |
| Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments | Nov 29, 2018 | Position, Spatial Reasoning | Code Available | 1 |
| Diagnosing the Environment Bias in Vision-and-Language Navigation | May 6, 2020 | Vision and Language Navigation | Code Available | 1 |
| Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments | Nov 20, 2017 | Reinforcement Learning, Translation | Code Available | 1 |
| Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts | Jun 4, 2024 | Navigate, Vision and Language Navigation | Code Available | 1 |
| ESceme: Vision-and-Language Navigation with Episodic Scene Memory | Mar 2, 2023 | Vision and Language Navigation | Code Available | 1 |
| A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation | May 5, 2023 | Vision and Language Navigation | Code Available | 1 |
| Self-Monitoring Navigation Agent via Auxiliary Progress Estimation | Jan 10, 2019 | Natural Language Visual Grounding, Vision and Language Navigation | Code Available | 1 |
| Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas | Mar 25, 2022 | Diversity, Vision and Language Navigation | Code Available | 1 |
| SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments | Aug 26, 2021 | Vision and Language Navigation | Code Available | 1 |
| VALAN: Vision and Language Agent Navigation | Dec 6, 2019 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1 |
| Simple and Effective Synthesis of Indoor 3D Scenes | Apr 6, 2022 | Data Augmentation, Vision and Language Navigation | Code Available | 1 |
| Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View | Jan 10, 2020 | Vision and Language Navigation | Code Available | 1 |
| Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation | Jul 23, 2021 | Vision and Language Navigation, Vision-Language Navigation | Code Available | 1 |
| Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding | Oct 15, 2020 | Vision and Language Navigation | Code Available | 1 |
| Sim-to-Real Transfer for Vision-and-Language Navigation | Nov 7, 2020 | Vision and Language Navigation | Code Available | 1 |
| Neighbor-view Enhanced Model for Vision and Language Navigation | Jul 15, 2021 | Navigate, Vision and Language Navigation | Code Available | 1 |
| One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones | Feb 14, 2022 | Vision and Language Navigation | Code Available | 1 |
| Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation | Nov 10, 2021 | Decoder, Navigate | Code Available | 1 |
| g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks | Nov 26, 2024 | Contrastive Learning, Question Answering | Code Available | 1 |
| FedVLN: Privacy-preserving Federated Vision-and-Language Navigation | Mar 28, 2022 | Privacy Preserving, Vision and Language Navigation | Code Available | 1 |
| PRET: Planning with Directed Fidelity Trajectory for Vision and Language Navigation | Jul 16, 2024 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Reinforced Structured State-Evolution for Vision-Language Navigation | Apr 20, 2022 | Navigate, Vision and Language Navigation | Code Available | 1 |
| A Recurrent Vision-and-Language BERT for Navigation | Nov 26, 2020 | Decision Making, Decoder | Code Available | 1 |
| GridMM: Grid Memory Map for Vision-and-Language Navigation | Jul 24, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation | Aug 24, 2023 | cross-modal alignment, Descriptive | Code Available | 1 |
| Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | Jul 1, 2020 | Style Transfer, Text Style Transfer | Code Available | 1 |
| Pathdreamer: A World Model for Indoor Navigation | May 18, 2021 | model, Semantic Segmentation | Code Available | 1 |
| Language and Visual Entity Relationship Graph for Agent Navigation | Oct 19, 2020 | Dynamic Time Warping, Navigate | Code Available | 1 |
| Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision | Dec 1, 2021 | cross-modal alignment, Navigate | Code Available | 1 |
| Explore the Potential Performance of Vision-and-Language Navigation Model: a Snapshot Ensemble Method | Nov 28, 2021 | Vision and Language Navigation | Unverified | 0 |
| Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation | Jul 11, 2020 | Decision Making, Imitation Learning | Unverified | 0 |
| Explicit Object Relation Alignment for Vision and Language Navigation | Nov 16, 2021 | Instruction Following, Relation | Unverified | 0 |
| A^2Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models | Aug 15, 2023 | Navigate, Robot Navigation | Unverified | 0 |
| Evaluating Explanation Methods for Vision-and-Language Navigation | Oct 10, 2023 | Decision Making, Navigate | Unverified | 0 |
| Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation | Mar 6, 2024 | Representation Learning, Vision and Language Navigation | Unverified | 0 |
| Endowing Embodied Agents with Spatial Reasoning Capabilities for Vision-and-Language Navigation | Apr 9, 2025 | Hallucination, Spatial Reasoning | Unverified | 0 |
| Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments – Extended Abstract | Jun 12, 2020 | Vision and Language Navigation | Unverified | 0 |
| Do Visual Imaginations Improve Vision-and-Language Navigation Agents? | Mar 20, 2025 | Vision and Language Navigation | Unverified | 0 |
| AIGeN: An Adversarial Approach for Instruction Generation in VLN | Apr 15, 2024 | Decoder, Vision and Language Navigation | Unverified | 0 |
| DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation | Apr 30, 2025 | Navigate, Object | Unverified | 0 |
| Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions? | Nov 28, 2023 | Data Augmentation, Translation | Unverified | 0 |
| Just Ask: An Interactive Learning Framework for Vision and Language Navigation | Dec 2, 2019 | Continual Learning, Data Augmentation | Unverified | 0 |
| Disrupting Vision-Language Model-Driven Navigation Services via Adversarial Object Fusion | May 29, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Diagnosing Vision-and-Language Navigation: What Really Matters | Dec 17, 2021 | Diagnostic, Object | Unverified | 0 |
| MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation | Jan 14, 2024 | Decision Making, Vision and Language Navigation | Unverified | 0 |
| Masked Path Modeling for Vision-and-Language Navigation | May 23, 2023 | Action Generation, Navigate | Unverified | 0 |
| IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation | Mar 28, 2024 | Attribute, Language Modelling | Unverified | 0 |
| Iterative Vision-and-Language Navigation | Oct 6, 2022 | Instruction Following, Vision and Language Navigation | Unverified | 0 |