Dynamic Memory Networks for Visual and Textual Question Answering

2016-03-04

Caiming Xiong, Stephen Merity, Richard Socher

Code

github.com/dandelin/Dynamic-memory-networks-plus-Pytorch
pytorch★ 0
github.com/ajenningsfrankston/Dynamic-Memory-Network-Plus-master
tf★ 0
github.com/jxz542189/dmn_plus
tf★ 0
github.com/sy-sunmoon/Clever-Commenter-Let-s-Try-More-Apps
pytorch★ 0
github.com/ethancaballero/Improved-Dynamic-Memory-Networks-DMN-plus
none★ 0
github.com/DongjunLee/dmn-tensorflow
tf★ 0
github.com/edithal-14/DMN-Novelty
pytorch★ 0
github.com/vchudinov/dynamic_memory_networks_with_keras
tf★ 0
github.com/therne/dmn-tensorflow
tf★ 0
github.com/imatge-upc/vqa-2016-cvprw
tf★ 0

Abstract

Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering. One such architecture, the dynamic memory network (DMN), obtained high accuracy on a variety of language tasks. However, it was not shown whether the architecture achieves strong results for question answering when supporting facts are not marked during training or whether it could be applied to other modalities such as images. Based on an analysis of the DMN, we propose several improvements to its memory and input modules. Together with these changes, we introduce a novel input module for images in order to be able to answer visual questions. Our new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting fact supervision.

Tasks

Question Answering, Visual Question Answering, Visual Question Answering (VQA)

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
COCO Visual Question Answering (VQA) real images 1.0 open ended	DMN+ [xiong2016dynamic]	Percentage correct	60.4	—	Unverified
VQA v1 test-dev	DMN+	Accuracy	60.3	—	Unverified
VQA v1 test-std	DMN+	Accuracy	60.4	—	Unverified
