Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

2020-03-30 · CVPR 2020 · Code Available

Qibin Hou, Li Zhang, Ming-Ming Cheng, Jiashi Feng


Code

github.com/Andrew-Qibin/SPNet
Official · In paper · PyTorch · ★ 411
github.com/MaybeShewill-CV/sfnet-tensorflow
TensorFlow · ★ 0

Abstract

Spatial pooling has been proven highly effective in capturing long-range contextual information for pixel-wise prediction tasks, such as scene parsing. In this paper, beyond conventional spatial pooling that usually has a regular shape of NxN, we rethink the formulation of spatial pooling by introducing a new pooling strategy, called strip pooling, which considers a long but narrow kernel, i.e., 1xN or Nx1. Based on strip pooling, we further investigate spatial pooling architecture design by 1) introducing a new strip pooling module that enables backbone networks to efficiently model long-range dependencies, 2) presenting a novel building block with diverse spatial pooling as a core, and 3) systematically comparing the performance of the proposed strip pooling and conventional spatial pooling techniques. Both novel pooling-based designs are lightweight and can serve as an efficient plug-and-play module in existing scene parsing networks. Extensive experiments on popular benchmarks (e.g., ADE20K and Cityscapes) demonstrate that our simple approach establishes new state-of-the-art results. Code is made available at https://github.com/Andrew-Qibin/SPNet.
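The contrast between an NxN window and a 1xN or Nx1 strip can be made concrete with a small sketch. The code below is not taken from the SPNet repository. It assumes average pooling and assumes each strip spans the full width or height of the feature map. The helper names (`strip_pool_horizontal`, `strip_pool_vertical`, `fuse_strips`) are hypothetical, and the broadcast-and-sum fusion step is an illustrative choice rather than something stated in the abstract.

```python
from typing import List

Matrix = List[List[float]]


def strip_pool_horizontal(x: Matrix) -> List[float]:
    """Average each row with a 1xW strip: one value per row (H values)."""
    return [sum(row) / len(row) for row in x]


def strip_pool_vertical(x: Matrix) -> List[float]:
    """Average each column with an Hx1 strip: one value per column (W values)."""
    height = len(x)
    return [sum(col) / height for col in zip(*x)]


def fuse_strips(x: Matrix) -> Matrix:
    """Broadcast both strip outputs back to HxW and add them.

    Assumption: summation is used here only to show that every position
    receives context from its entire row and its entire column.
    """
    rows = strip_pool_horizontal(x)
    cols = strip_pool_vertical(x)
    return [[r + c for c in cols] for r in rows]


# A toy 2x3 single-channel feature map.
feature_map = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

row_context = strip_pool_horizontal(feature_map)  # [2.0, 5.0]
col_context = strip_pool_vertical(feature_map)    # [2.5, 3.5, 4.5]
fused = fuse_strips(feature_map)
```

Each output of a strip summarizes a whole row or column, so a single pooling step links positions that a small square window would reach only after many layers, while the narrow shape avoids mixing in content from unrelated rows or columns.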

Tasks

Scene Parsing, Semantic Segmentation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
ADE20K	SPNet (ResNet-101)	Validation mIoU	45.6	—	Unverified
Cityscapes test	SPNet (ResNet-101)	Mean IoU (class)	82	—	Unverified
