Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation
Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa
Code
- github.com/Kelym/FAST (official implementation, PyTorch)
Abstract
We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language Navigation challenge of Anderson et al. (2018). Given a natural language instruction and photo-realistic image views of a previously unseen environment, the agent is tasked with navigating from a source to a target location as quickly as possible. While all current approaches make local action decisions or score entire trajectories using beam search, ours balances local and global signals when exploring an unobserved environment. Importantly, this lets us act greedily but use global signals to backtrack when necessary. Applying the FAST framework to existing state-of-the-art models yields a 17% relative gain (an absolute 6% gain) in Success rate weighted by Path Length (SPL).
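The abstract's core idea, act greedily on local signals but consult a global score over partial trajectories to decide when to backtrack, amounts to a best-first search over a frontier of abandoned branch points. The sketch below is a minimal illustration of that decoding loop, not the authors' implementation (see the repository above); `neighbors`, `local_score`, `global_score`, and `is_goal` are hypothetical callables standing in for the environment graph and the underlying navigator's scoring heads.

```python
import heapq
import itertools

def fast_navigate(start, neighbors, local_score, global_score, is_goal,
                  max_steps=40):
    """Greedy action decoding with frontier-aware backtracking (a sketch).

    All callables are hypothetical stand-ins for illustration:
      neighbors(v)          -> iterable of viewpoints adjacent to v
      local_score(path, v)  -> navigator's local score for stepping to v next
      global_score(path)    -> progress estimate for a whole partial path
      is_goal(path)         -> True when the agent should stop
    """
    tie = itertools.count()  # tie-breaker so the heap never compares paths
    path = [start]
    visited = {start}
    frontier = []            # max-heap: (-global_score, tie, partial path)

    for _ in range(max_steps):
        if is_goal(path):
            return path

        candidates = [v for v in neighbors(path[-1]) if v not in visited]
        greedy = (max(candidates, key=lambda v: local_score(path, v))
                  if candidates else None)

        # Keep the non-chosen extensions alive on the global frontier so an
        # abandoned branch point can be resumed later.
        for v in candidates:
            if v != greedy:
                cand = path + [v]
                heapq.heappush(frontier, (-global_score(cand), next(tie), cand))

        if greedy is not None and (
                not frontier
                or global_score(path + [greedy]) >= -frontier[0][0]):
            # Local and global signals agree: keep acting greedily.
            path = path + [greedy]
        elif frontier:
            # The global signal prefers an earlier branch point: backtrack by
            # resuming the best partial path on the frontier.
            if greedy is not None:
                cand = path + [greedy]
                heapq.heappush(frontier, (-global_score(cand), next(tie), cand))
            _, _, path = heapq.heappop(frontier)
        else:
            break            # dead end with an empty frontier
        visited.add(path[-1])

    return path
```

Because the frontier is keyed by the global score, backtracking is "free" in search terms: the agent jumps to the most promising previously seen branch point instead of re-scoring entire trajectories as beam search does.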
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Room-to-Room (R2R) | Tactical Rewind - short | SPL | 0.41 | — | Unverified |
| VLN Challenge | Tactical Rewind - long | Success rate | 0.61 | — | Unverified |
| VLN Challenge | Tactical Rewind - short | Success rate | 0.54 | — | Unverified |