Scene-Text Aware Image and Text Retrieval with Dual-Encoder

2022-05-01 · ACL 2022

Shumpei Miyawaki, Taku Hasegawa, Kyosuke Nishida, Takuma Kato, Jun Suzuki

Abstract

We tackle the tasks of image and text retrieval using a dual-encoder model in which images and text are encoded independently. This model has attracted attention as an approach that enables efficient offline inference by connecting vision and language in the same semantic space; however, whether an image encoder as part of a dual-encoder model can interpret scene-text (i.e., the textual information in images) is unclear. We propose pre-training methods that encourage a joint understanding of the scene-text and surrounding visual information. The experimental results demonstrate that our methods improve the retrieval performance of the dual-encoder models.
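
To illustrate why independent encoding permits offline inference, the following is a minimal sketch of dual-encoder retrieval, not the authors' implementation. It assumes the image and text encoders have already produced embeddings in a shared space; the three-dimensional vectors, the item names, and the helper functions `l2_normalize`, `build_index`, and `retrieve` are all hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_index(embeddings):
    """Offline step: normalize every gallery embedding once and store it."""
    return {item_id: l2_normalize(vec) for item_id, vec in embeddings.items()}


def retrieve(query_embedding, index, top_k=2):
    """Online step: score one query embedding against the precomputed index."""
    q = l2_normalize(query_embedding)
    scores = [
        (item_id, sum(a * b for a, b in zip(q, vec)))
        for item_id, vec in index.items()
    ]
    # Highest cosine similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical embeddings standing in for image-encoder outputs (assumption).
image_embeddings = {
    "img_stop_sign": [0.9, 0.1, 0.0],
    "img_cafe_menu": [0.1, 0.8, 0.3],
    "img_beach": [0.0, 0.2, 0.9],
}
index = build_index(image_embeddings)

# Hypothetical text-encoder output for a single text query (assumption).
query = [0.2, 0.9, 0.2]
results = retrieve(query, index, top_k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

Because the gallery embeddings never depend on the query, `build_index` can run once ahead of time, and each query costs only one encoder pass plus dot products. The same scheme works in the other direction (an image query against an index of text embeddings).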

Tasks

Retrieval, Text Retrieval
