Understanding Convolution for Semantic Segmentation
Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, Garrison Cottrell
Code
- github.com/TuSimple/TuSimple-DUC (official, in paper; MXNet)
- github.com/leemathew1998/GradientWeight (PyTorch)
- github.com/leemathew1998/RG (PyTorch)
- github.com/modelhub-ai/duc-semantic (MXNet)
- github.com/y-ouali/pytorch_segmentation (PyTorch)
Abstract
Recent advances in deep learning, especially deep convolutional neural networks (CNNs), have led to significant improvement over previous semantic segmentation systems. Here we show how to improve pixel-wise semantic segmentation by manipulating convolution-related operations that are of both theoretical and practical value. First, we design dense upsampling convolution (DUC) to generate pixel-level prediction, which is able to capture and decode more detailed information that is generally missing in bilinear upsampling. Second, we propose a hybrid dilated convolution (HDC) framework in the encoding phase. This framework 1) effectively enlarges the receptive fields (RF) of the network to aggregate global information; 2) alleviates what we call the "gridding issue" caused by the standard dilated convolution operation. We evaluate our approaches thoroughly on the Cityscapes dataset and achieve a state-of-the-art result of 80.1% mIoU on the test set at the time of submission. We have also achieved state-of-the-art results on the KITTI road estimation benchmark and the PASCAL VOC 2012 segmentation task. Our source code can be found at https://github.com/TuSimple/TuSimple-DUC.
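The DUC decoding step described in the abstract can be sketched as follows: instead of bilinearly upsampling a coarse prediction, the network predicts `num_classes * r * r` channels at the downsampled resolution and rearranges them into a full-resolution map (a pixel-shuffle rearrangement). This is a minimal NumPy sketch under assumed shapes (19 classes and 8x downsampling, as for Cityscapes with a stride-8 backbone); the paper's official implementation is the MXNet code linked above.

```python
import numpy as np

def duc_rearrange(pred, num_classes, r):
    """DUC decoding sketch: rearrange a low-resolution prediction of shape
    (num_classes * r * r, H, W) into a full-resolution map of shape
    (num_classes, H * r, W * r). Each group of r*r channels at a coarse
    location becomes an r x r patch of per-pixel predictions."""
    c, h, w = pred.shape
    assert c == num_classes * r * r, "channel count must be num_classes * r^2"
    x = pred.reshape(num_classes, r, r, h, w)   # split channels into (class, i, j)
    x = x.transpose(0, 3, 1, 4, 2)              # -> (class, H, i, W, j)
    return x.reshape(num_classes, h * r, w * r)

# Illustrative sizes: 19 classes, 8x downsampling, 28x28 coarse map.
low = np.random.randn(19 * 8 * 8, 28, 28)
full = duc_rearrange(low, num_classes=19, r=8)
print(full.shape)  # (19, 224, 224)
```

Because every output pixel gets its own learned logits rather than an interpolated value, fine structures lost by bilinear upsampling can still be recovered.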
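The "gridding issue" that HDC addresses can be illustrated with a small 1-D sketch (an illustration of the idea, not the paper's code): stacking 3-tap dilated convolutions with the same rate means one output pixel only ever sees a sparse grid of inputs, while varying the rates across layers fills in the gaps.

```python
def receptive_positions(dilations):
    """Input offsets (relative to one output pixel) reachable through a
    stack of 3-tap 1-D convolutions with the given dilation rates."""
    positions = {0}
    for r in dilations:
        positions = {p + k * r for p in positions for k in (-1, 0, 1)}
    return positions

# Standard dilated convolution: the same rate at every layer samples only
# a sparse grid of inputs -> the gridding issue.
print(sorted(receptive_positions([2, 2, 2])))  # even offsets only: -6, -4, ..., 6

# HDC-style schedule: varying rates (e.g. 1, 2, 3) cover every position
# in the final receptive field, while still enlarging it.
print(sorted(receptive_positions([1, 2, 3])))  # every integer from -6 to 6
```

The specific rate schedule [1, 2, 3] here is just one example of a grouping whose rates share no common factor pattern that would leave holes; the receptive field span is the same in both cases, but only the varied schedule achieves dense coverage.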
Benchmark Results
| Dataset | Model | Metric | Claimed (%) | Verified | Status |
|---|---|---|---|---|---|
| Cityscapes test | DUC-HDC (ResNet-101) | Mean IoU (class) | 77.6 | — | Unverified |
| PASCAL VOC 2012 test | TuSimple | Mean IoU | 83.1 | — | Unverified |