Cross-Task Knowledge Transfer for Visually-Grounded Navigation

2019-05-01 · ICLR 2019

Devendra Singh Chaplot, Lisa Lee, Ruslan Salakhutdinov, Devi Parikh, Dhruv Batra

Abstract

Recent efforts on training visual navigation agents conditioned on language using deep reinforcement learning have been successful in learning policies for two different tasks: learning to follow navigational instructions and embodied question answering. In this paper, we aim to learn a multitask model capable of jointly learning both tasks, and transferring knowledge of words and their grounding in visual objects across tasks. The proposed model uses a novel Dual-Attention unit to disentangle the knowledge of words in the textual representations and visual objects in the visual representations, and align them with each other. This disentangled task-invariant alignment of representations facilitates grounding and knowledge transfer across both tasks. We show that the proposed model outperforms a range of baselines on both tasks in simulated 3D environments. We also show that this disentanglement of representations makes our model modular and interpretable, and allows for zero-shot transfer to instructions containing new words by leveraging object detectors.

Tasks

Deep Reinforcement Learning, Disentanglement, Embodied Question Answering, Question Answering, Reinforcement Learning, Transfer Learning, Visual Navigation
