
Object Detection in Videos by Short and Long Range Object Linking

2018-01-30 · IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2018

Peng Tang†, Chunyu Wang‡, Xinggang Wang†, Wenyu Liu†, Wenjun Zeng‡, Jingdong Wang‡
† School of EIC, Huazhong University of Science and Technology
‡ Microsoft Research Asia


Abstract

We address the problem of detecting objects in videos, with an interest in exploring temporal context. Our core idea is to link objects over short and long ranges to improve classification quality. First, our approach proposes a set of candidate spatio-temporal cuboids for a short video segment, each of which serves as a container associating an object across short-range frames. Second, it regresses the precise box locations in each frame for each cuboid proposal, yielding a tubelet with a single classification score aggregated from the scores of the boxes in the tubelet. Third, we extend the non-maximum suppression (NMS) algorithm to remove spatially overlapping tubelets in the short segment, avoiding the broken tubelets that frame-wise NMS produces. Finally, we link the tubelets across temporally overlapping short segments over the whole video, boosting the classification scores of positive detections by aggregating the scores in the linked tubelets. Experiments on the ImageNet VID dataset show that our approach achieves state-of-the-art performance.
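
The tubelet-level suppression step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how tubelet overlap or the tubelet score is computed, so the sketch assumes the overlap is the mean per-frame IoU and the score is the mean of the box scores. The names `box_iou`, `tubelet_overlap`, `tubelet_score`, and `tubelet_nms`, and the 0.5 threshold, are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Tubelet = Sequence[Box]                  # one box per frame of the short segment


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tubelet_overlap(t1: Tubelet, t2: Tubelet) -> float:
    """Assumed overlap measure: mean per-frame IoU of two equal-length tubelets."""
    return sum(box_iou(a, b) for a, b in zip(t1, t2)) / len(t1)


def tubelet_score(box_scores: Sequence[float]) -> float:
    """Assumed aggregation: mean of the per-frame box scores."""
    return sum(box_scores) / len(box_scores)


def tubelet_nms(tubelets: Sequence[Tubelet],
                scores: Sequence[float],
                threshold: float = 0.5) -> List[int]:
    """Greedy NMS over whole tubelets; returns indices of the kept tubelets.

    A tubelet is kept or dropped as a unit, so it is never broken up the
    way per-frame suppression could break it.
    """
    order = sorted(range(len(tubelets)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep tubelet i only if it does not overlap any higher-scoring kept one.
        if all(tubelet_overlap(tubelets[i], tubelets[k]) <= threshold for k in keep):
            keep.append(i)
    return keep
```

Calling `tubelet_nms(tubelets, scores)` with one list of per-frame boxes per tubelet returns the indices of the surviving tubelets in descending score order. Other overlap measures, such as the minimum per-frame IoU, would fit the same greedy loop.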
