Multi-kernel learning of deep convolutional features for action recognition

2017-07-21

Biswa Sengupta, Yu Qian

Abstract

Image understanding using deep convolutional networks has reached human-level performance, yet the closely related problem of video understanding, especially action recognition, has not reached the requisite level of maturity. We combine multi-kernel support vector machines (SVMs) with a multi-stream deep convolutional neural network to achieve close to state-of-the-art performance on a 51-class activity recognition problem (the HMDB-51 dataset); this dataset has proved particularly challenging for deep neural networks due to heterogeneity in camera viewpoints, video quality, etc. The resulting architecture is named pillar networks, as each (very) deep neural network acts as a pillar for the hierarchical classifiers. In addition, we illustrate that hand-crafted features such as improved dense trajectories (iDT) and Multi-skip Feature Stacking (MIFS), as additional pillars, can further improve performance.
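
The idea of feeding several feature streams into one kernel SVM can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic arrays stand in for per-stream CNN features, the RBF kernel and the fixed uniform kernel weights are assumptions (a full multi-kernel learner would typically learn the weights), and the names `combined_kernel` and `make_stream` are hypothetical helpers.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC


def combined_kernel(streams_a, streams_b, weights):
    """Weighted sum of per-stream RBF kernels between two sample sets."""
    return sum(w * rbf_kernel(a, b) for w, a, b in zip(weights, streams_a, streams_b))


rng = np.random.default_rng(0)
n_classes = 3
y_train = np.repeat(np.arange(n_classes), 30)  # 90 training labels
y_test = np.repeat(np.arange(n_classes), 10)   # 30 test labels


def make_stream(labels, centers):
    """Synthetic stand-in for one stream's features: class mean plus noise."""
    noise = rng.normal(scale=1.0, size=(len(labels), centers.shape[1]))
    return centers[labels] + noise


# Two hypothetical streams (e.g. appearance and motion) with different dimensions.
centers = [rng.normal(scale=2.0, size=(n_classes, d)) for d in (16, 8)]
train_streams = [make_stream(y_train, c) for c in centers]
test_streams = [make_stream(y_test, c) for c in centers]

# Assumption: uniform kernel weights; these are not taken from the paper.
weights = [0.5, 0.5]

K_train = combined_kernel(train_streams, train_streams, weights)  # (90, 90)
K_test = combined_kernel(test_streams, train_streams, weights)    # (30, 90)

# An SVM on the precomputed combined kernel plays the role of the classifier
# sitting on top of the feature "pillars".
clf = SVC(kernel="precomputed").fit(K_train, y_train)
pred = clf.predict(K_test)
print("accuracy:", float((pred == y_test).mean()))
```

Adding a further pillar, such as a hand-crafted descriptor, would amount to appending another feature array to each stream list and another entry to `weights`.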

Tasks

Action Recognition, Activity Recognition, Temporal Action Localization, Video Understanding
