Motionlets: Mid-level 3D Parts for Human Motion Recognition

2013-06-01 · CVPR 2013

Li-Min Wang, Yu Qiao, Xiaoou Tang

Abstract

This paper proposes motionlet, a mid-level and spatiotemporal part, for human motion recognition. Motionlet can be seen as a tight cluster in motion and appearance space, corresponding to the moving process of different body parts. We postulate three key properties of motionlet for action recognition: high motion saliency, multiple scale representation, and representative-discriminative ability. Towards this goal, we develop a data-driven approach to learn motionlets from training videos. First, we extract 3D regions with high motion saliency. Then we cluster these regions and preserve the centers as candidate templates for motionlet. Finally, we examine the representative and discriminative power of the candidates, and introduce a greedy method to select effective candidates. With motionlets, we present a mid-level representation for video, called motionlet activation vector. We conduct experiments on three datasets, KTH, HMDB51, and UCF50. The results show that the proposed methods significantly outperform state-of-the-art methods.
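
The last two steps of the abstract (greedy selection of candidate templates, then the motionlet activation vector) can be illustrated with a minimal sketch. The abstract does not give the scoring rule or the matching function, so everything below is an assumption made for illustration: the function names `activation_vector` and `greedy_select` are hypothetical, similarity is taken to be negative squared Euclidean distance, "discriminative" is approximated by the gap between in-class and out-of-class mean response, and "representative" by how many not-yet-covered videos of that class a candidate responds to. It is not the authors' implementation.

```python
from typing import List, Sequence, Set, Tuple


def activation_vector(regions: Sequence[Sequence[float]],
                      templates: Sequence[Sequence[float]]) -> List[float]:
    """Max-pool each template's similarity over all regions of one video."""
    def similarity(a, b):
        # Assumed stand-in for template matching: negative squared distance.
        return -sum((x - y) ** 2 for x, y in zip(a, b))
    return [max(similarity(r, t) for r in regions) for t in templates]


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def _profile(column: Sequence[float],
             labels: Sequence[int]) -> Tuple[float, Set[int]]:
    """Return (gap, covered videos) for the class this candidate separates best."""
    overall = _mean(column)
    best_gap, best_cover = 0.0, set()
    for y in sorted(set(labels)):
        inside = [r for r, l in zip(column, labels) if l == y]
        outside = [r for r, l in zip(column, labels) if l != y]
        if not outside:
            continue  # a single-class dataset has nothing to discriminate
        gap = _mean(inside) - _mean(outside)
        if gap > best_gap:
            best_gap = gap
            # Videos of class y on which the candidate fires above its average.
            best_cover = {v for v, (r, l) in enumerate(zip(column, labels))
                          if l == y and r > overall}
    return best_gap, best_cover


def greedy_select(responses: Sequence[Sequence[float]],
                  labels: Sequence[int], k: int) -> List[int]:
    """Greedily pick up to k candidate indices.

    responses[v][c] is the activation of candidate c on training video v.
    """
    n_candidates = len(responses[0])
    profiles = [_profile([row[c] for row in responses], labels)
                for c in range(n_candidates)]
    selected: List[int] = []
    covered: Set[int] = set()
    for _ in range(min(k, n_candidates)):
        remaining = [c for c in range(n_candidates) if c not in selected]

        def gain(c: int) -> float:
            gap, cover = profiles[c]
            # Reward discriminative candidates that cover new videos.
            return gap * (1 + len(cover - covered))

        best = max(remaining, key=gain)
        selected.append(best)
        covered |= profiles[best][1]
    return selected
```

With four training videos labelled `[0, 0, 1, 1]` and three candidates, a candidate that responds equally to every video has zero gap and is skipped in favour of the two candidates that each fire on one class.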

Tasks

Action Recognition, Temporal Action Localization
