Temporal Action Detection with Structured Segment Networks

2017-04-20 · ICCV 2017 · Code Available

Yue Zhao, Yuanjun Xiong, Li-Min Wang, Zhirong Wu, Xiaoou Tang, Dahua Lin

Abstract

Detecting actions in untrimmed videos is an important yet challenging task. In this paper, we present the structured segment network (SSN), a novel framework which models the temporal structure of each action instance via a structured temporal pyramid. On top of the pyramid, we further introduce a decomposed discriminative model comprising two classifiers, one for classifying actions and one for determining completeness. This allows the framework to effectively distinguish positive proposals from background or incomplete ones, thus leading to both accurate recognition and localization. These components are integrated into a unified network that can be efficiently trained in an end-to-end fashion. Additionally, a simple yet effective temporal action proposal scheme, dubbed temporal actionness grouping (TAG), is devised to generate high-quality action proposals. On two challenging benchmarks, THUMOS14 and ActivityNet, our method remarkably outperforms previous state-of-the-art methods, demonstrating superior accuracy and strong adaptivity in handling actions with various temporal structures.
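
The decomposed model in the abstract pairs an action classifier with a completeness classifier. The sketch below illustrates one plausible way two such outputs could be combined into a per-class detection score for a single proposal. It is not taken from the authors' code: the function names, the background class at index 0, and the multiplicative combination are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function mapping a logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def detection_scores(action_logits, completeness_logits):
    """Combine the two classifier outputs for one proposal (hypothetical helper).

    action_logits: K + 1 values, index 0 assumed to be the background class.
    completeness_logits: K values, one per action class.
    Returns K scores: P(class) * P(proposal is complete | class).
    """
    action_probs = softmax(action_logits)
    return [
        action_probs[k + 1] * sigmoid(completeness_logits[k])
        for k in range(len(completeness_logits))
    ]


# Two proposals with the same action evidence but different completeness.
action = [0.0, 3.0, 0.0]  # background, class 1, class 2
complete = detection_scores(action, [2.0, -2.0])
incomplete = detection_scores(action, [-2.0, -2.0])
print(complete[0] > incomplete[0])
```

Under this assumed combination, a proposal that covers only part of an action is scored lower than a complete one even when the action classifier is equally confident about both, which is the behavior the abstract describes.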

Benchmark Results

Dataset | Model | Metric | Claimed | Verified | Status
THUMOS14 | SSN | mAP@0.5 | 29.8 | - | Unverified
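
The metric mAP@0.5 counts a predicted segment as correct when its temporal intersection-over-union (tIoU) with a ground-truth segment is at least 0.5. A minimal sketch of that overlap test follows. The function names and the (start, end) tuple layout in seconds are assumptions for illustration, and the full mAP computation (ranking detections, matching each ground truth only once, averaging over classes) is not shown.

```python
def temporal_iou(pred, gt):
    """tIoU between two segments given as (start, end) tuples."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count."""
    return temporal_iou(pred, gt) >= threshold


print(is_match((2.0, 10.0), (0.0, 10.0)))   # overlap 8 of union 10
print(is_match((0.0, 10.0), (5.0, 15.0)))   # overlap 5 of union 15
```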