| Into the Unknown: Generating Geospatial Descriptions for New Environments | Jun 28, 2024 | Language Modelling, Large Language Model | Code Available | 0 |
| Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions | Jun 27, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation for Effective-and-Efficient Vision-and-Language Navigation | Jun 25, 2024 | Knowledge Distillation, Test unseen | Code Available | 1 |
| CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information | Jun 20, 2024 | Vision and Language Navigation | Code Available | 2 |
| Contrast Sets for Evaluating Language-Guided Robot Policies | Jun 19, 2024 | Vision and Language Navigation | Unverified | 0 |
| Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation | Jun 14, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| I2EDL: Interactive Instruction Error Detection and Localization | Jun 7, 2024 | Vision and Language Navigation | Unverified | 0 |
| Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts | Jun 4, 2024 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Augmented Commonsense Knowledge for Remote Object Grounding | Jun 3, 2024 | Decision Making, Object | Code Available | 0 |
| Vision-and-Language Navigation Generative Pretrained Transformer | May 27, 2024 | Decoder, Imitation Learning | Unverified | 0 |
| MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains | May 17, 2024 | Diversity, Navigate | Unverified | 0 |
| Vision-and-Language Navigation via Causal Learning | Apr 16, 2024 | Causal Inference, Contrastive Learning | Code Available | 2 |
| AIGeN: An Adversarial Approach for Instruction Generation in VLN | Apr 15, 2024 | Decoder, Vision and Language Navigation | Unverified | 0 |
| DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning | Apr 2, 2024 | Contrastive Learning, Decision Making | Code Available | 0 |
| Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation | Apr 2, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation | Mar 28, 2024 | Attribute, Language Modelling | Unverified | 0 |
| Scaling Vision-and-Language Navigation With Offline RL | Mar 27, 2024 | Offline RL, Vision and Language Navigation | Unverified | 0 |
| OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation | Mar 26, 2024 | Vision and Language Navigation | Unverified | 0 |
| Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation | Mar 23, 2024 | Navigate, Object | Unverified | 0 |
| Continual Vision-and-Language Navigation | Mar 22, 2024 | Continual Learning, Navigate | Unverified | 0 |
| Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Mar 18, 2024 | Common Sense Reasoning, Efficient Exploration | Code Available | 0 |
| Mind the Error! Detection and Localization of Instruction Errors in Vision-and-Language Navigation | Mar 15, 2024 | Navigate, Vision and Language Navigation | Unverified | 0 |
| NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning | Mar 12, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning | Mar 9, 2024 | Contrastive Learning, Navigate | Unverified | 0 |
| Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation | Mar 6, 2024 | Representation Learning, Vision and Language Navigation | Unverified | 0 |
| NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation | Feb 24, 2024 | Decision Making, Instruction Following | Unverified | 0 |
| WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | Feb 8, 2024 | Conversational Web Navigation, Text Generation | Code Available | 5 |
| VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation | Feb 5, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| NavHint: Vision and Language Navigation Agent with a Hint Generator | Feb 4, 2024 | Vision and Language Navigation | Code Available | 0 |
| MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation | Jan 14, 2024 | Decision Making, Vision and Language Navigation | Unverified | 0 |
| WebVLN: Vision-and-Language Navigation on Websites | Dec 25, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Which way is 'right'?: Uncovering limitations of Vision-and-Language Navigation model | Nov 30, 2023 | Vision and Language Navigation | Unverified | 0 |
| DAP: Domain-aware Prompt Learning for Vision-and-Language Navigation | Nov 29, 2023 | cross-modal alignment, Navigate | Unverified | 0 |
| Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions? | Nov 28, 2023 | Data Augmentation, Translation | Unverified | 0 |
| Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation | Nov 22, 2023 | Navigate, Test-time Adaptation | Code Available | 1 |
| Vision and Language Navigation in the Real World via Online Visual Language Mapping | Oct 16, 2023 | Vision and Language Navigation | Unverified | 0 |
| LangNav: Language as a Perceptual Representation for Navigation | Oct 11, 2023 | Image Captioning, Language Modeling | Unverified | 0 |
| Evaluating Explanation Methods for Vision-and-Language Navigation | Oct 10, 2023 | Decision Making, Navigate | Unverified | 0 |
| Prompt-based Context- and Domain-aware Pretraining for Vision and Language Navigation | Sep 7, 2023 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation | Aug 24, 2023 | cross-modal alignment, Descriptive | Code Available | 1 |
| VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation | Aug 20, 2023 | Transfer Learning, Vision and Language Navigation | Code Available | 0 |
| March in Chat: Interactive Prompting for Remote Embodied Referring Expression | Aug 20, 2023 | Referring Expression, Vision and Language Navigation | Code Available | 1 |
| A^2Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models | Aug 15, 2023 | Navigate, Robot Navigation | Unverified | 0 |
| AerialVLN: Vision-and-Language Navigation for UAVs | Aug 13, 2023 | cross-modal alignment, Navigate | Code Available | 2 |
| Mind the Gap: Improving Success Rate of Vision-and-Language Navigation by Revisiting Oracle Success Routes | Aug 7, 2023 | Navigate, Vision and Language Navigation | Unverified | 0 |
| Scaling Data Generation in Vision-and-Language Navigation | Jul 28, 2023 | Imitation Learning, Vision and Language Navigation | Code Available | 2 |
| Kefa: A Knowledge Enhanced and Fine-grained Aligned Speaker for Navigation Instruction Generation | Jul 25, 2023 | Vision and Language Navigation | Code Available | 0 |
| GridMM: Grid Memory Map for Vision-and-Language Navigation | Jul 24, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Learning Navigational Visual Representations with Semantic Map Supervision | Jul 23, 2023 | Representation Learning, Self-Supervised Learning | Code Available | 1 |
| Learning Vision-and-Language Navigation from YouTube Videos | Jul 22, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 |