Attention on Attention: Architectures for Visual Question Answering (VQA)
2018-03-21
Jasdeep Singh, Vincent Ying, Alex Nutkiewicz
Code Available
- github.com/SinghJasdeep/Attention-on-Attention-for-VQA (official, referenced in paper; PyTorch)
- github.com/feifengwhu/question_attention (PyTorch)
- github.com/VincentYing/Attention-on-Attention-for-VQA (PyTorch)
Abstract
Visual Question Answering (VQA) is an increasingly popular topic in deep learning research, requiring the coordination of natural language processing and computer vision modules in a single architecture. We build upon the model that placed first in the VQA Challenge by developing thirteen new attention mechanisms and introducing a simplified classifier. After 300 GPU hours of extensive hyperparameter and architecture search, we achieve an evaluation score of 64.78%, outperforming the existing state-of-the-art single model's validation score of 63.15%.
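The abstract does not spell out the thirteen attention mechanisms themselves. As a minimal sketch of the general question-guided attention pattern this family of VQA models is built around (the module name, dimensions, and single-layer projections below are illustrative assumptions, not the paper's design), a question embedding scores each image region and the scores form a softmax-weighted sum of visual features:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuestionGuidedAttention(nn.Module):
    """Illustrative sketch: score each image region against the question
    embedding, then return the attention-weighted visual feature."""
    def __init__(self, v_dim, q_dim, hidden_dim):
        super().__init__()
        self.v_proj = nn.Linear(v_dim, hidden_dim)  # project region features
        self.q_proj = nn.Linear(q_dim, hidden_dim)  # project question embedding
        self.score = nn.Linear(hidden_dim, 1)       # one logit per region

    def forward(self, v, q):
        # v: (batch, num_regions, v_dim) image region features
        # q: (batch, q_dim) question embedding
        joint = torch.tanh(self.v_proj(v) + self.q_proj(q).unsqueeze(1))
        weights = F.softmax(self.score(joint), dim=1)  # (batch, num_regions, 1)
        return (weights * v).sum(dim=1)                # (batch, v_dim)

# Usage with hypothetical dimensions (e.g. 36 region features, GRU question encoding):
attn = QuestionGuidedAttention(v_dim=2048, q_dim=512, hidden_dim=512)
v = torch.randn(8, 36, 2048)
q = torch.randn(8, 512)
attended = attn(v, q)  # (8, 2048), fed onward to an answer classifier
```

The attended visual feature would then be fused with the question embedding and passed to the classifier; the paper's variations and its simplified classifier are detailed in the linked repositories.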