Dilated SpineNet for Semantic Segmentation

2021-03-23

Abdullah Rashwan, Xianzhi Du, Xiaoqi Yin, Jing Li


Abstract

Scale-permuted networks have shown promising results on object bounding box detection and instance segmentation. Scale permutation and cross-scale fusion of features enable the network to capture multi-scale semantics while preserving spatial resolution. In this work, we evaluate this meta-architecture design on semantic segmentation, another vision task that benefits from high spatial resolution and multi-scale feature fusion at different network stages. By further leveraging dilated convolution operations, we propose SpineNet-Seg, a network discovered by NAS that is searched from the DeepLabv3 system. SpineNet-Seg is designed with a better scale-permuted network topology with customized dilation ratios per block on a semantic segmentation task. SpineNet-Seg models outperform the DeepLabv3/v3+ baselines at all model scales on multiple popular benchmarks in speed and accuracy. In particular, our SpineNet-S143+ model achieves a new state of the art on the popular Cityscapes benchmark at 83.04% mIoU and attains strong performance on the PASCAL VOC2012 benchmark at 85.56% mIoU. SpineNet-Seg models also show promising results on a challenging Street View segmentation dataset. Code and checkpoints will be open-sourced.
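
The abstract relies on dilated convolutions to widen the receptive field without giving up spatial resolution. A minimal 1D sketch of that idea follows, in pure Python; it is not the SpineNet-Seg implementation, and the function names `dilated_conv1d` and `receptive_field`, the zero padding, and the toy kernel are all assumptions made for illustration.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded 1D cross-correlation whose taps are `dilation` apart.

    The output has the same length as the input, so resolution is kept
    while the span covered by the kernel grows with the dilation rate.
    """
    k = len(kernel)
    span = dilation * (k - 1)
    pad_left = span // 2
    # Illustrative choice: pad with zeros so the output length matches the input.
    padded = [0.0] * pad_left + list(signal) + [0.0] * (span - pad_left)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


signal = [1, 2, 3, 4, 5]
kernel = [1, 1, 1]
dense = dilated_conv1d(signal, kernel, dilation=1)    # taps at i-1, i, i+1
dilated = dilated_conv1d(signal, kernel, dilation=2)  # taps at i-2, i, i+2
```

With the same three-tap kernel, dilation 2 covers five input positions instead of three, and both outputs keep the input length of five.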

Tasks

Instance Segmentation Segmentation Semantic Segmentation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Cityscapes val	SpineNet-S143+ (single-scale test)	mIoU	83.04	—	Unverified
PASCAL VOC 2012 val	SpineNet-S143 (single-scale test)	mIoU	85.64	—	Unverified
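
The metric in the table above, mIoU, is the per-class intersection over union averaged across classes. The sketch below shows that computation on flat lists of labels; the function name `mean_iou` and the ignore label of 255 are assumptions for illustration, and benchmark toolkits typically accumulate the counts over a whole dataset rather than a single image.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union over classes that appear in pred or target.

    `pred` and `target` are equal-length sequences of integer class ids.
    Pixels whose target equals `ignore_label` (an assumed convention) are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch adds one pixel to the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

A reported value such as 83.04 corresponds to this quantity expressed as a percentage.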
