
MetaVD: A Meta Video Dataset for enhancing human action recognition datasets

2021-11-01 · Computer Vision and Image Understanding 2021 · Code Available

Yuya Yoshikawa, Yutaro Shigeto, Akikazu Takeuchi

Abstract

Numerous practical datasets have been developed to recognize human actions from videos. However, many of them were constructed by collecting videos within a limited domain; thus, a model trained on one of the existing datasets often fails to accurately classify videos from a different domain. A possible solution for this drawback is to enhance the domain of each action label, i.e., to import videos associated with a given action label from other datasets, and then to train a model using the enhanced dataset. To realize this solution, we constructed a meta video dataset from the existing datasets for human action recognition, referred to as MetaVD. MetaVD comprises six popular human action recognition datasets, which we integrated by annotating 568,015 relation labels in total. These relation labels reflect equality, similarity, and hierarchy between action labels of the original datasets. We further present simple yet effective dataset enhancement methods using MetaVD, which are useful for training models with higher generalization performance, as established by experiments on human action classification. As a further contribution of MetaVD, we show that analysis of MetaVD can provide useful insight into the datasets themselves.
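To make the enhancement idea concrete, here is a minimal sketch of how "equal" relation labels could be used to import videos from other datasets into a target dataset's action classes. This is an illustrative assumption, not the authors' implementation: the data structures (`videos` as a `(dataset, label) -> video IDs` map, `relations` as annotated tuples) and the function name are hypothetical.

```python
# Hypothetical sketch (not MetaVD's actual API): enhance a target
# dataset's label -> videos map using "equal" relation labels.

def enhance_dataset(target, videos, relations):
    """Augment each action label of `target` with videos from other
    datasets whose labels are annotated as equal.

    videos:    {(dataset, label): [video_id, ...]}
    relations: [(dataset_a, label_a, relation, dataset_b, label_b)]
    """
    # Start from the target dataset's own videos.
    enhanced = {label: list(vids)
                for (ds, label), vids in videos.items() if ds == target}
    for ds_a, label_a, rel, ds_b, label_b in relations:
        if rel != "equal":
            continue  # "similar"/hierarchy relations would need other handling
        if ds_a == target and ds_b != target:
            enhanced.setdefault(label_a, []).extend(videos.get((ds_b, label_b), []))
        elif ds_b == target and ds_a != target:
            enhanced.setdefault(label_b, []).extend(videos.get((ds_a, label_a), []))
    return enhanced
```

For example, if UCF101's "basketball" is annotated as equal to Kinetics' "shooting basketball", enhancing UCF101 would merge the Kinetics videos into that class before training.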
