| A Recurrent Vision-and-Language BERT for Navigation | Nov 26, 2020 | Decision Making, Decoder | Code Available | 1 | 5 |
| Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | Jul 1, 2020 | Style Transfer, Text Style Transfer | Code Available | 1 | 5 |
| How Much Can CLIP Benefit Vision-and-Language Tasks? | Jul 13, 2021 | Question Answering, Vision and Language Navigation | Code Available | 1 | 5 |
| History Aware Multimodal Transformer for Vision-and-Language Navigation | Oct 25, 2021 | Decision Making, Navigate | Code Available | 1 | 5 |
| Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation | Jul 23, 2021 | Vision and Language Navigation, Vision-Language Navigation | Code Available | 1 | 5 |
| Neighbor-view Enhanced Model for Vision and Language Navigation | Jul 15, 2021 | Navigate, Vision and Language Navigation | Code Available | 1 | 5 |
| Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation | Aug 24, 2023 | cross-modal alignment, Descriptive | Code Available | 1 | 5 |
| GridMM: Grid Memory Map for Vision-and-Language Navigation | Jul 24, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 | 5 |
| FedVLN: Privacy-preserving Federated Vision-and-Language Navigation | Mar 28, 2022 | Privacy Preserving, Vision and Language Navigation | Code Available | 1 | 5 |
| Language and Visual Entity Relationship Graph for Agent Navigation | Oct 19, 2020 | Dynamic Time Warping, Navigate | Code Available | 1 | 5 |