Temporally Distributed Networks for Fast Video Semantic Segmentation

2020-04-03 · CVPR 2020 · Code Available

Ping Hu, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Stan Sclaroff, Federico Perazzi

Code

github.com/feinanshan/TDNet
PyTorch · ★ 206

Abstract

We present TDNet, a temporally distributed network designed for fast and accurate video semantic segmentation. We observe that features extracted from a certain high-level layer of a deep CNN can be approximated by composing features extracted from several shallower sub-networks. Leveraging the inherent temporal continuity in videos, we distribute these sub-networks over sequential frames. Therefore, at each time step, we only need to perform a lightweight computation to extract a sub-features group from a single sub-network. The full features used for segmentation are then recomposed by application of a novel attention propagation module that compensates for geometry deformation between frames. A grouped knowledge distillation loss is also introduced to further improve the representation power at both full and sub-feature levels. Experiments on Cityscapes, CamVid, and NYUD-v2 demonstrate that our method achieves state-of-the-art accuracy with significantly faster speed and lower latency.
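
The scheduling idea in the abstract (one shallow sub-network per frame, with the full feature rebuilt from the most recent sub-features) can be illustrated with a toy sketch. This is not the TDNet implementation: `make_subnet` and `TemporallyDistributedExtractor` are hypothetical names, the "sub-networks" are trivial scaling functions, and plain concatenation stands in for the paper's attention propagation module, so no compensation for motion between frames is modeled.

```python
from collections import deque


def make_subnet(k):
    """Hypothetical stand-in for the k-th shallow sub-network."""
    def subnet(frame):
        # Toy "sub-feature": scale every value in the frame by (k + 1).
        return [(k + 1) * x for x in frame]
    return subnet


class TemporallyDistributedExtractor:
    """Runs exactly one sub-network per frame, cycling through them in order."""

    def __init__(self, subnets):
        self.subnets = subnets
        # Keep only the latest len(subnets) sub-features, oldest first.
        self.window = deque(maxlen=len(subnets))
        self.t = 0
        self.calls = [0] * len(subnets)  # how often each sub-network ran

    def step(self, frame):
        k = self.t % len(self.subnets)  # sub-network assigned to this frame
        self.window.append(self.subnets[k](frame))
        self.calls[k] += 1
        self.t += 1
        # Assumption: concatenation replaces attention propagation here.
        return [v for feat in self.window for v in feat]


extractor = TemporallyDistributedExtractor([make_subnet(0), make_subnet(1)])
outputs = [extractor.step(f) for f in ([1, 2], [3, 4], [5, 6])]
```

With two sub-networks, each frame costs a single shallow forward pass, and from the second frame onward the recomposed feature combines sub-features from the current and the previous frame.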

Tasks

Knowledge Distillation, Real-Time Semantic Segmentation, Segmentation, Semantic Segmentation, Video Semantic Segmentation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
NYU-Depth V2	TD2-PSP50	Mean IoU	43.5	—	Unverified
NYU-Depth V2	TD4-PSP18	Mean IoU	37.4	—	Unverified
UrbanLF	TDNet (ResNet-50)	mIoU (Syn)	74.71	—	Unverified
