Explicit Object Relation Alignment for Vision and Language Navigation

2022-05-01 · ACL 2022 · Code Available

Yue Zhang, Parisa Kordjamshidi

Code

github.com/hlr/object-grounding-for-vln
Official · In paper · PyTorch · ★ 2

Abstract

In this paper, we investigate the problem of vision and language navigation. To solve this problem, it is important to ground the landmarks and spatial relations in the textual instructions in the visual modality. We propose a neural agent named Explicit Object Relation Alignment Agent (EXOR) to explicitly align the spatial information in the instruction with that in the visual environment, including landmarks and the spatial relationships between the agent and landmarks. Empirically, our proposed method surpasses the baseline by a large margin on the R2R dataset. We provide a comprehensive analysis to show our model’s spatial reasoning ability and explainability.
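
As a rough illustration of what explicit landmark and relation alignment can look like, the toy sketch below scores each visible object against a landmark mentioned in the instruction and against the spatial relation attached to it. It is not the EXOR architecture, which the abstract does not detail: the function names (`relation_of`, `align`), the 2-d embeddings, the 30-degree heading thresholds, and the additive relation bonus are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def relation_of(heading_deg):
    """Map an object's heading relative to the agent (degrees) to a coarse
    relation word. The 30-degree thresholds are hypothetical."""
    if heading_deg < -30:
        return "left"
    if heading_deg > 30:
        return "right"
    return "front"

def align(landmark_vec, relation_word, objects, relation_bonus=1.0):
    """Return one alignment weight per visible object.

    objects: list of (label, embedding, heading_deg) tuples.
    The score is landmark similarity plus a bonus (an assumption of this
    sketch) when the object's relation to the agent matches the instruction.
    """
    scores = []
    for _label, vec, heading in objects:
        score = cosine(landmark_vec, vec)
        if relation_of(heading) == relation_word:
            score += relation_bonus
        scores.append(score)
    return softmax(scores)

# Instruction fragment: "the table on your left" (toy embeddings).
objects = [
    ("table", [1.0, 0.0], -60.0),  # a table to the left
    ("table", [1.0, 0.0], 45.0),   # a table to the right
    ("sofa",  [0.0, 1.0], -60.0),  # a sofa to the left
]
weights = align([1.0, 0.0], "left", objects)
best = max(range(len(objects)), key=lambda i: weights[i])
print(objects[best][0], relation_of(objects[best][2]))
```

In this toy setup, the object that matches both the landmark and the relation receives the largest weight, while objects that match only one of the two are down-weighted equally.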

Tasks

Object Relation · Spatial Reasoning · Vision and Language Navigation
