| Behavioral Analysis of Vision-and-Language Navigation Agents | Jul 20, 2023 | Vision and Language Navigation | Code Available | 0 |
| VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View | Jul 12, 2023 | Decision Making, Natural Language Understanding | Code Available | 1 |
| CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation | Jun 17, 2023 | Decision Making, Instruction Following | Unverified | 0 |
| PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation | May 30, 2023 | Image Outpainting, Language Modelling | Unverified | 0 |
| GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation | May 26, 2023 | Vision and Language Navigation | Code Available | 0 |
| NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models | May 26, 2023 | Instruction Following, Vision and Language Navigation | Code Available | 2 |
| Masked Path Modeling for Vision-and-Language Navigation | May 23, 2023 | Action Generation, Navigate | Unverified | 0 |
| PASTS: Progress-Aware Spatio-Temporal Transformer Speaker For Vision-and-Language Navigation | May 19, 2023 | Data Augmentation, Vision and Language Navigation | Unverified | 0 |
| A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation | May 5, 2023 | Vision and Language Navigation | Code Available | 1 |
| Improving Vision-and-Language Navigation by Generating Future-View Image Semantics | Apr 11, 2023 | Image Generation, Navigate | Unverified | 0 |
| KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation | Mar 28, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 |
| HOP+: History-enhanced and Order-aware Pre-training for Vision-and-Language Navigation | Mar 20, 2023 | Decision Making, Language Modeling | Unverified | 0 |
| Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding | Mar 7, 2023 | Vision and Language Navigation, Visual Navigation | Unverified | 0 |
| MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation | Mar 2, 2023 | Navigate, Vision and Language Navigation | Code Available | 0 |
| ESceme: Vision-and-Language Navigation with Episodic Scene Memory | Mar 2, 2023 | Vision and Language Navigation | Code Available | 1 |
| VLN-Trans: Translator for the Vision and Language Navigation Agent | Feb 18, 2023 | Vision and Language Navigation | Code Available | 1 |
| Graph based Environment Representation for Vision-and-Language Navigation in Continuous Environments | Jan 11, 2023 | Object, object-detection | Unverified | 0 |
| BEVBert: Multimodal Map Pre-training for Language-guided Navigation | Dec 8, 2022 | Vision and Language Navigation, Visual Navigation | Code Available | 2 |
| CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation | Nov 30, 2022 | Diversity, Instruction Following | Unverified | 0 |
| Navigation as Attackers Wish? Towards Building Robust Embodied Agents under Federated Learning | Nov 27, 2022 | Federated Learning, Navigate | Unverified | 0 |
| Structure-Encoding Auxiliary Tasks for Improved Visual Representation in Vision-and-Language Navigation | Nov 20, 2022 | Test unseen, Vision and Language Navigation | Unverified | 0 |
| DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents | Oct 22, 2022 | Autonomous Driving, Dialogue Act Classification | Code Available | 1 |
| ULN: Towards Underspecified Vision-and-Language Navigation | Oct 18, 2022 | Vision and Language Navigation | Code Available | 0 |
| Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation | Oct 14, 2022 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Iterative Vision-and-Language Navigation | Oct 6, 2022 | Instruction Following, Vision and Language Navigation | Unverified | 0 |
| A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning | Oct 6, 2022 | Imitation Learning, Instruction Following | Unverified | 0 |
| LOViS: Learning Orientation and Visual Signals for Vision and Language Navigation | Sep 26, 2022 | Spatial Reasoning, Vision and Language Navigation | Code Available | 0 |
| Ground then Navigate: Language-guided Navigation in Dynamic Scenes | Sep 24, 2022 | Autonomous Driving, Navigate | Code Available | 0 |
| Anticipating the Unseen Discrepancy for Vision and Language Navigation | Sep 10, 2022 | Data Augmentation, Decision Making | Unverified | 0 |
| Learning from Unlabeled 3D Environments for Vision-and-Language Navigation | Aug 24, 2022 | Language Modeling, Language Modelling | Code Available | 1 |
| A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues | Jul 24, 2022 | cross-modal alignment, Trajectory Planning | Code Available | 0 |
| CLEAR: Improving Vision-Language Navigation with Cross-Lingual, Environment-Agnostic Representations | Jul 5, 2022 | Navigate, Representation Learning | Code Available | 0 |
| 1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022) | Jun 23, 2022 | Data Augmentation, Vision and Language Navigation | Code Available | 2 |
| Local Slot Attention for Vision-and-Language Navigation | Jun 17, 2022 | Navigate, Vision and Language Navigation | Code Available | 0 |
| FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation | Jun 9, 2022 | Vision and Language Navigation | Code Available | 0 |
| Explicit Object Relation Alignment for Vision and Language Navigation | May 1, 2022 | Object, Relation | Code Available | 0 |
| Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments | Apr 20, 2022 | Navigate, Vision and Language Navigation | Unverified | 0 |
| Reinforced Structured State-Evolution for Vision-Language Navigation | Apr 20, 2022 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Simple and Effective Synthesis of Indoor 3D Scenes | Apr 6, 2022 | Data Augmentation, Vision and Language Navigation | Code Available | 1 |
| EnvEdit: Environment Editing for Vision-and-Language Navigation | Mar 29, 2022 | Data Augmentation, Diversity | Code Available | 1 |
| FedVLN: Privacy-preserving Federated Vision-and-Language Navigation | Mar 28, 2022 | Privacy Preserving, Vision and Language Navigation | Code Available | 1 |
| Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas | Mar 25, 2022 | Diversity, Vision and Language Navigation | Code Available | 1 |
| Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions | Mar 22, 2022 | Vision and Language Navigation | Code Available | 2 |
| HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation | Mar 22, 2022 | Decision Making, Language Modeling | Code Available | 1 |
| Cross-modal Map Learning for Vision and Language Navigation | Mar 10, 2022 | Vision and Language Navigation | Code Available | 1 |
| Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation | Mar 5, 2022 | Imitation Learning, Vision and Language Navigation | Code Available | 1 |
| Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation | Feb 23, 2022 | Efficient Exploration, Navigate | Code Available | 2 |
| One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones | Feb 14, 2022 | Vision and Language Navigation | Code Available | 1 |
| Self-supervised 3D Semantic Representation Learning for Vision-and-Language Navigation | Jan 26, 2022 | Representation Learning, Test unseen | Unverified | 0 |
| Explore the Potential Performance of Vision-and-Language Navigation Model: a Snapshot Ensemble Method | Jan 16, 2022 | Vision and Language Navigation | Unverified | 0 |