Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition

2016-12-01 · NeurIPS 2016

Jinzhuo Wang, Wenmin Wang, Xiongtao Chen, Ronggang Wang, Wen Gao


Abstract

Contexts are crucial for action recognition in video. Current methods often mine contexts after extracting hierarchical local features and focus on their high-order encodings. This paper instead explores contexts as early as possible and leverages their evolution for action recognition. In particular, we introduce a novel architecture called the deep alternative neural network (DANN), which stacks alternative layers. Each alternative layer consists of a volumetric convolutional layer followed by a recurrent layer. The former acts as a local feature learner, while the latter collects contexts. Compared with feed-forward neural networks, DANN learns contexts of local features from the very beginning. This setting helps preserve hierarchical context evolutions, which we show are essential for recognizing similar actions. In addition, we present an adaptive method that determines the temporal size of the network input based on optical flow energy, and develop a volumetric pyramid pooling layer to handle input clips of arbitrary sizes. We demonstrate the advantages of DANN on two benchmarks, HMDB51 and UCF101, and report results competitive with or superior to the state of the art.
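
The abstract mentions a volumetric pyramid pooling layer that copes with input clips of arbitrary sizes. The general idea can be illustrated with a minimal pure-Python sketch: a single-channel T x H x W volume is max-pooled over a pyramid of n x n x n grids, so the output length depends only on the pyramid levels. The function names, the levels (1, 2, 4), the use of max pooling, and the floor/ceil bin boundaries are assumptions made for illustration; the abstract does not specify the paper's actual configuration.

```python
from itertools import product


def bin_edges(size, bins):
    """Split range(size) into `bins` non-empty intervals (floor start, ceil end)."""
    return [
        (i * size // bins, ((i + 1) * size + bins - 1) // bins)
        for i in range(bins)
    ]


def volumetric_pyramid_pool(volume, levels=(1, 2, 4)):
    """Max-pool a T x H x W nested list into a fixed-length vector.

    The output length is sum(n**3 for n in levels), independent of T, H, W.
    The levels and the max operator are illustrative assumptions.
    """
    t, h, w = len(volume), len(volume[0]), len(volume[0][0])
    pooled = []
    for n in levels:
        # Visit every cell of the n x n x n grid laid over the volume.
        for (t0, t1), (h0, h1), (w0, w1) in product(
            bin_edges(t, n), bin_edges(h, n), bin_edges(w, n)
        ):
            pooled.append(max(
                volume[i][j][k]
                for i in range(t0, t1)
                for j in range(h0, h1)
                for k in range(w0, w1)
            ))
    return pooled
```

With the default levels, a clip of 3 x 5 x 7 values and a clip of 16 x 32 x 32 values both map to a vector of 1 + 8 + 64 = 73 entries per channel, which is what allows a fixed-size classifier to follow a variable-size input.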

Tasks

Action Recognition, Optical Flow Estimation, Temporal Action Localization
