
RANet: Ranking Attention Network for Fast Video Object Segmentation

2019-08-19 · ICCV 2019 · Code Available

Ziqin Wang, Jun Xu, Li Liu, Fan Zhu, Ling Shao


Abstract

Although online learning (OL) techniques have boosted the performance of semi-supervised video object segmentation (VOS) methods, the huge time cost of OL greatly restricts their practicality. Matching-based and propagation-based methods run faster by avoiding OL techniques. However, they are limited by sub-optimal accuracy, due to mismatching and drifting problems. In this paper, we develop a real-time yet very accurate Ranking Attention Network (RANet) for VOS. Specifically, to integrate the insights of matching-based and propagation-based methods, we employ an encoder-decoder framework to learn pixel-level similarity and segmentation in an end-to-end manner. To better utilize the similarity maps, we propose a novel ranking attention module, which automatically ranks and selects these maps for fine-grained VOS performance. Experiments on the DAVIS-16 and DAVIS-17 datasets show that our RANet achieves the best speed-accuracy trade-off, e.g., 33 milliseconds per frame and J&F=85.5% on DAVIS-16. With OL, our RANet reaches J&F=87.1% on DAVIS-16, exceeding state-of-the-art VOS methods. The code can be found at https://github.com/Storife/RANet.
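The core idea of the ranking attention module described above can be sketched in a few lines: given a stack of pixel-level similarity maps (one per foreground template pixel), score each map, rank them, and keep only the strongest responses. The sketch below is a simplified, hypothetical stand-in: in the paper the ranking scores are produced by a learned sub-network, whereas here we use each map's peak response as a proxy score.

```python
import numpy as np

def ranking_attention(sim_maps, k):
    """Rank similarity maps and keep the top-k (simplified sketch).

    sim_maps: array of shape (N, H, W), one similarity map per
              template pixel, as produced by pixel-level matching.
    k:        number of maps to keep for the decoder.

    Assumption: the ranking score is the map's peak response; RANet
    instead learns these scores end-to-end.
    """
    # One scalar score per map: its maximum response over all locations.
    scores = sim_maps.reshape(sim_maps.shape[0], -1).max(axis=1)
    # Indices of the k highest-scoring maps, strongest first.
    order = np.argsort(-scores)[:k]
    # Select and reorder the maps; the result has a fixed channel
    # count k, so it can feed a convolutional decoder directly.
    return sim_maps[order]

# Toy example: 4 random 8x8 maps, keep the 2 strongest.
rng = np.random.default_rng(0)
maps = rng.random((4, 8, 8))
top2 = ranking_attention(maps, k=2)
print(top2.shape)  # (2, 8, 8)
```

Selecting a fixed number of ranked maps is what lets a decoder with a fixed input channel count consume a variable number of template pixels, which is one reason the method avoids per-video online fine-tuning.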


Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| DAVIS 2016 | RANet+ (online learning) | J&F | 87.1 | — | Unverified |
| DAVIS 2016 | RANet | J&F | 85.45 | — | Unverified |
| DAVIS 2017 (test-dev) | RANet | J&F | 55.4 | — | Unverified |
| DAVIS 2017 (val) | RANet | J&F | 65.7 | — | Unverified |
| DAVIS (no YouTube-VOS training) | RANet | D17 val (G) | 65.7 | — | Unverified |

Reproductions