
ActionNet-VE Dataset: A Dataset for Describing Visual Events by Extending VIRAT Ground 2.0

2015-11-25 · 2015 8th International Conference on Signal Processing, Image Processing and Pattern Recognition (SIP), 2015

Jinyoung Moon, Yongjin Kwon, Kyuchang Kang, Jongyoul Park


Abstract

This paper introduces a dataset for recognizing and describing interactive events between objects of interest, including persons, cars, bikes, and carried objects. Although many video datasets exist for human activity recognition, most focus on persons and their actions, and their annotations sometimes omit specific information about related objects, such as object type and minimum bounding boxes. The ActionNet-VE dataset was designed to include full annotations of all objects and events of interest that occur in a video clip, so that the semantics of each event can be described. The dataset adopts 75 video clips from VIRAT Ground 2.0 and extends their annotations of events and related objects. In addition, the dataset describes the semantics of each event using sentence elements such as verb, subject, and objects.
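As a rough illustration of the annotation structure the abstract describes (typed objects with minimum bounding boxes, and events expressed through a verb, a subject, and objects), the following minimal Python sketch may help. It is not the dataset's actual file format: the class names, field names, object-type labels, the `get_into` verb, and all coordinate and frame values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Object types named in the abstract; the exact string labels are assumptions.
OBJECT_TYPES = {"person", "car", "bike", "carried_object"}


@dataclass
class BoundingBox:
    """Minimum bounding box of one object in one frame (hypothetical layout)."""
    frame: int
    x: int
    y: int
    width: int
    height: int


@dataclass
class AnnotatedObject:
    """An object of interest with its type and per-frame bounding boxes."""
    object_id: int
    object_type: str
    boxes: List[BoundingBox] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject types outside the four categories listed in the abstract.
        if self.object_type not in OBJECT_TYPES:
            raise ValueError(f"unknown object type: {self.object_type}")


@dataclass
class EventAnnotation:
    """An event described by sentence elements: verb, subject, and objects."""
    verb: str
    subject: AnnotatedObject
    objects: List[AnnotatedObject] = field(default_factory=list)
    start_frame: int = 0
    end_frame: int = 0

    def describe(self) -> str:
        # Join subject, verb, and objects into a simple sentence-like string.
        parts = [f"{self.subject.object_type}#{self.subject.object_id}", self.verb]
        parts += [f"{o.object_type}#{o.object_id}" for o in self.objects]
        return " ".join(parts)


# Hypothetical example: a person getting into a car.
person = AnnotatedObject(1, "person", [BoundingBox(10, 100, 50, 40, 90)])
car = AnnotatedObject(2, "car", [BoundingBox(10, 160, 60, 120, 70)])
event = EventAnnotation("get_into", person, [car], start_frame=10, end_frame=85)
print(event.describe())  # person#1 get_into car#2
```

Keeping the subject and objects as references to annotated objects, rather than as plain strings, means each sentence element stays linked to its object type and bounding boxes, which is the kind of object-level detail the abstract says other datasets sometimes leave out.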
