Decoupling Features in Hierarchical Propagation for Video Object Segmentation

2022-10-18

Zongxin Yang, Yi Yang

Abstract

This paper focuses on developing a more effective method of hierarchical propagation for semi-supervised Video Object Segmentation (VOS). Based on vision transformers, the recently developed Associating Objects with Transformers (AOT) approach introduces hierarchical propagation into VOS and has shown promising results. Hierarchical propagation gradually propagates information from past frames to the current frame and transfers the current frame feature from object-agnostic to object-specific. However, the increase in object-specific information will inevitably lead to the loss of object-agnostic visual information in deep propagation layers. To solve this problem and further facilitate the learning of visual embeddings, this paper proposes a Decoupling Features in Hierarchical Propagation (DeAOT) approach. Firstly, DeAOT decouples the hierarchical propagation of object-agnostic and object-specific embeddings by handling them in two independent branches. Secondly, to compensate for the additional computation from dual-branch propagation, we propose an efficient module for constructing hierarchical propagation, i.e., the Gated Propagation Module, which is carefully designed with single-head attention. Extensive experiments show that DeAOT significantly outperforms AOT in both accuracy and efficiency. On YouTube-VOS, DeAOT achieves 86.0% at 22.4 fps and 82.0% at 53.4 fps. Without test-time augmentations, we achieve new state-of-the-art performance on four benchmarks, i.e., YouTube-VOS (86.2%), DAVIS 2017 (86.2%), DAVIS 2016 (92.9%), and VOT 2020 (0.622). Project page: https://github.com/z-x-yang/AOT.
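
To make the dual-branch idea concrete, here is a minimal NumPy sketch of one propagation step. It is not the authors' implementation: the abstract only states that the two embeddings are propagated in separate branches and that the module uses gating with single-head attention. Everything else is an assumption made for illustration, including the sigmoid gate computed from the query, the reuse of one attention map by both branches, and all function and parameter names (`gated_single_head_attention`, `dual_branch_step`, `w_gate_vis`, `w_gate_id`).

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_single_head_attention(query, key, value, w_gate):
    """Single-head attention followed by an element-wise gate (assumed form).

    query: (Nq, C), key: (Nk, C), value: (Nk, Cv), w_gate: (C, Cv).
    Returns the gated output (Nq, Cv) and the attention map (Nq, Nk).
    """
    scale = 1.0 / np.sqrt(query.shape[-1])
    attn = softmax(query @ key.T * scale, axis=-1)  # one head, no splitting
    propagated = attn @ value
    gate = sigmoid(query @ w_gate)  # hypothetical gate, values in (0, 1)
    return gate * propagated, attn


def dual_branch_step(vis_cur, vis_past, id_past, w_gate_vis, w_gate_id):
    """One illustrative propagation step with two decoupled branches.

    The visual (object-agnostic) branch computes the attention map; the ID
    (object-specific) branch reuses it, so the two embeddings are never mixed.
    """
    vis_out, attn = gated_single_head_attention(
        vis_cur, vis_past, vis_past, w_gate_vis
    )
    id_gate = sigmoid(vis_cur @ w_gate_id)
    id_out = id_gate * (attn @ id_past)
    return vis_out, id_out, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vis_cur = rng.normal(size=(5, 8))    # current-frame visual tokens
    vis_past = rng.normal(size=(7, 8))   # past-frame visual tokens
    id_past = rng.normal(size=(7, 4))    # past-frame object ID embeddings
    vis_out, id_out, attn = dual_branch_step(
        vis_cur, vis_past, id_past,
        rng.normal(size=(8, 8)), rng.normal(size=(8, 4)),
    )
    print(vis_out.shape, id_out.shape, attn.shape)
```

In this sketch, sharing the attention map means the similarity computation is paid for once per step even though two branches are propagated, which is one plausible way to offset the extra cost of a dual-branch design.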

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
| DAVIS 2016 | R50-DeAOT-L | J&F | 92.3 | - | Unverified |
| DAVIS 2016 | SwinB-DeAOT-L | J&F | 92.9 | - | Unverified |
| DAVIS 2016 | DeAOT-T | J&F | 88.9 | - | Unverified |
| DAVIS 2016 | DeAOT-S | J&F | 89.3 | - | Unverified |
| DAVIS 2016 | DeAOT-B | J&F | 91 | - | Unverified |
| DAVIS 2016 | DeAOT-L | J&F | 92 | - | Unverified |
| DAVIS-2017 (test-dev) | SwinB-DeAOT-L | J&F | 82.8 | - | Unverified |
| DAVIS-2017 (test-dev) | R50-DeAOT-L | J&F | 80.7 | - | Unverified |
| DAVIS-2017 (test-dev) | DeAOT-L | J&F | 77.9 | - | Unverified |
| DAVIS-2017 (test-dev) | DeAOT-B | J&F | 76.2 | - | Unverified |
| DAVIS-2017 (test-dev) | DeAOT-S | J&F | 75.4 | - | Unverified |
| DAVIS-2017 (test-dev) | DeAOT-T | J&F | 73.7 | - | Unverified |
| DAVIS 2017 (val) | DeAOT-S | J&F | 80.8 | - | Unverified |
| DAVIS 2017 (val) | R50-DeAOT-L | J&F | 85.2 | - | Unverified |
| DAVIS 2017 (val) | DeAOT-T | J&F | 80.5 | - | Unverified |
| DAVIS 2017 (val) | SwinB-DeAOT-L | J&F | 86.2 | - | Unverified |
| DAVIS 2017 (val) | DeAOT-B | J&F | 82.2 | - | Unverified |
| DAVIS 2017 (val) | DeAOT-L | J&F | 84.1 | - | Unverified |
| MOSE | DeAOT | J&F | 59.4 | - | Unverified |
| VOT2020 | DeAOT-L | EAO | 0.59 | - | Unverified |
| VOT2020 | DeAOT-B | EAO | 0.57 | - | Unverified |
| VOT2020 | DeAOT-T | EAO | 0.47 | - | Unverified |
| VOT2020 | SwinB-DeAOT-L | EAO | 0.62 | - | Unverified |
| VOT2020 | R50-DeAOT-L | EAO | 0.61 | - | Unverified |
| VOT2020 | DeAOT-S | EAO | 0.59 | - | Unverified |
| YouTube-VOS 2018 | SwinB-DeAOT-L | Overall | 86.2 | - | Unverified |
| YouTube-VOS 2018 | R50-DeAOT-L | Overall | 86 | - | Unverified |
| YouTube-VOS 2018 | DeAOT-L | Overall | 84.8 | - | Unverified |
| YouTube-VOS 2018 | DeAOT-B | Overall | 84.6 | - | Unverified |
| YouTube-VOS 2018 | DeAOT-S | Overall | 84 | - | Unverified |
| YouTube-VOS 2018 | DeAOT-T | Overall | 82 | - | Unverified |
| YouTube-VOS 2019 | DeAOT-B | Overall | 84.6 | - | Unverified |
| YouTube-VOS 2019 | DeAOT-T | Overall | 82 | - | Unverified |
| YouTube-VOS 2019 | SwinB-DeAOT-L | Overall | 86.1 | - | Unverified |
| YouTube-VOS 2019 | R50-DeAOT-L | Overall | 85.9 | - | Unverified |
| YouTube-VOS 2019 | DeAOT-L | Overall | 84.7 | - | Unverified |
| YouTube-VOS 2019 | DeAOT-S | Overall | 83.8 | - | Unverified |
