MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond

2020-04-24 · ICLR 2021 · Code Available

Duy-Kien Nguyen, Vedanuj Goswami, Xinlei Chen

Code

github.com/facebookresearch/mmf/tree/master/projects/movie_mcan
Official · PyTorch

Abstract

This paper focuses on visual counting, which aims to predict the number of occurrences given a natural image and a query (e.g., a question or a category). Unlike most prior works, which use explicit, symbolic models that can be computationally expensive and limited in generalization, we propose a simple and effective alternative by revisiting modulated convolutions that fuse the query and the image locally. Following the design of the residual bottleneck, we call our method MoVie, short for Modulated conVolutional bottlenecks. Notably, MoVie reasons implicitly and holistically and needs only a single forward pass during inference. Nevertheless, MoVie showcases strong performance for counting: 1) advancing the state of the art on counting-specific VQA tasks while being more efficient; 2) outperforming prior art on difficult benchmarks like COCO for common object counting; 3) helping us secure first place in the 2020 VQA challenge when integrated as a module for 'number'-related questions in generic VQA models. Finally, we show evidence that modulated convolutions such as MoVie can serve as a general mechanism for reasoning tasks beyond counting.
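
To make "fuse the query and the image locally" concrete, here is a minimal pure-Python sketch of a query-modulated residual bottleneck. It is a generic FiLM-style illustration, not the authors' implementation: this page does not give MoVie's exact formulation. The function names, the `1 + delta` scaling, and the toy weights are all assumptions, and the 3x3 convolution of a standard bottleneck is omitted so that only 1x1 (per-location) convolutions remain.

```python
def linear(weights, x):
    """Dense layer without bias: weights is [out][in], x is [in]."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def modulated_bottleneck(feature_map, query, w_gamma, w_reduce, w_expand):
    """Hypothetical FiLM-style block (an assumption, not the paper's exact layer).

    feature_map: H x W x C nested lists of floats
    query:       length-D list of floats (e.g. an encoded question)
    """
    # One per-channel scale derived from the query, shared by all locations.
    # Using 1 + delta means a zero delta leaves the features unchanged.
    gamma = [1.0 + d for d in linear(w_gamma, query)]
    out = []
    for row in feature_map:
        out_row = []
        for pixel in row:
            # Local fusion: the query rescales the channels at this location.
            modulated = [g * v for g, v in zip(gamma, pixel)]
            hidden = relu(linear(w_reduce, modulated))  # 1x1 conv: C -> C_mid
            update = linear(w_expand, hidden)           # 1x1 conv: C_mid -> C
            # Residual connection, as in a residual bottleneck.
            out_row.append([v + u for v, u in zip(pixel, update)])
        out.append(out_row)
    return out


# Toy example: a 1 x 1 feature map with C = 2, D = 2, C_mid = 1.
feature_map = [[[1.0, 2.0]]]
w_gamma = [[1.0, 0.0], [0.0, 1.0]]
w_reduce = [[0.5, 0.5]]
w_expand = [[1.0], [-1.0]]

with_query = modulated_bottleneck(feature_map, [1.0, 0.0], w_gamma, w_reduce, w_expand)
no_query = modulated_bottleneck(feature_map, [0.0, 0.0], w_gamma, w_reduce, w_expand)
```

In this toy run the same image features yield different outputs depending on the query, and the whole computation is a single forward pass with no explicit object proposals or symbolic counting step.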

Tasks

Object Counting, Question Answering, Visual Question Answering (VQA)

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
HowMany-QA	MoVie-ResNeXt	Accuracy	64	—	Unverified
HowMany-QA	MoVie	Accuracy	61.2	—	Unverified
TallyQA-Complex	MoVie-ResNeXt	Accuracy	56.8	—	Unverified
TallyQA-Complex	MoVie	Accuracy	54.1	—	Unverified
TallyQA-Simple	MoVie-ResNeXt	Accuracy	74.9	—	Unverified
TallyQA-Simple	MoVie	Accuracy	70.8	—	Unverified
