Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos

2015-07-21

Serena Yeung, Olga Russakovsky, Ning Jin, Mykhaylo Andriluka, Greg Mori, Li Fei-Fei

Code

github.com/lauradhatt/Interesting-Reads

Abstract

Every moment counts in action recognition. A comprehensive understanding of human activity in video requires labeling every frame according to the actions occurring, placing multiple labels densely over a video sequence. To study this problem we extend the existing THUMOS dataset and introduce MultiTHUMOS, a new dataset of dense labels over unconstrained internet videos. Modeling multiple, dense labels benefits from temporal relations within and across classes. We define a novel variant of long short-term memory (LSTM) deep networks for modeling these temporal relations via multiple input and output connections. We show that this model improves action labeling accuracy and further enables deeper understanding tasks ranging from structured retrieval to action prediction.
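
To make "placing multiple labels densely over a video sequence" concrete, the sketch below turns interval annotations into a per-frame, multi-label 0/1 matrix. It is only an illustration: the function name `dense_labels`, the class names, the frame rate, and the `(class, start, end)` annotation layout are all assumptions, not the MultiTHUMOS file format or the authors' code.

```python
from typing import List, Sequence, Tuple

# Hypothetical annotation layout: (class name, start time in s, end time in s).
Interval = Tuple[str, float, float]


def dense_labels(
    intervals: Sequence[Interval],
    classes: Sequence[str],
    num_frames: int,
    fps: float,
) -> List[List[int]]:
    """Return a num_frames x len(classes) matrix of 0/1 labels.

    A frame can carry several labels at once, which is what makes the
    labeling both dense (every frame) and multi-label (overlapping actions).
    """
    index = {name: i for i, name in enumerate(classes)}
    labels = [[0] * len(classes) for _ in range(num_frames)]
    for name, start_s, end_s in intervals:
        # Convert seconds to a half-open frame range, clipped to the video.
        first = max(0, int(start_s * fps))
        last = min(num_frames, int(end_s * fps))
        for t in range(first, last):
            labels[t][index[name]] = 1
    return labels


# Illustrative values only: a 3 s clip at 2 frames per second.
classes = ["run", "jump", "throw"]
intervals = [("run", 0.0, 2.0), ("jump", 1.0, 1.5)]
labels = dense_labels(intervals, classes, num_frames=6, fps=2.0)
```

Here frame 2 is labeled with both "run" and "jump", while frames 4 and 5 carry no label at all. A per-frame model then has to predict one such row for every frame.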

Tasks

Action Recognition, Retrieval, Temporal Action Localization

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
MultiTHUMOS	Two-stream + LSTM	mAP	28.1	—	Unverified
MultiTHUMOS	Two-stream	mAP	27.6	—	Unverified
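
The mAP column can be read as a mean of per-class average precision over frames. The sketch below shows one common way to compute such a score. The exact evaluation protocol behind the numbers above is not stated on this page, so the ranking-based definition, the function names, and the toy scores are assumptions for illustration only.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], targets: Sequence[int]) -> float:
    """Average precision for one class: mean precision at each positive,
    with frames ranked by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, i in enumerate(order, start=1):
        if targets[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    score_rows: Sequence[Sequence[float]],
    target_rows: Sequence[Sequence[int]],
) -> float:
    """Mean of per-class AP; rows are frames, columns are classes.
    Classes with no positive frame are skipped (an assumption)."""
    num_classes = len(score_rows[0])
    aps: List[float] = []
    for c in range(num_classes):
        col_scores = [row[c] for row in score_rows]
        col_targets = [row[c] for row in target_rows]
        if any(col_targets):
            aps.append(average_precision(col_scores, col_targets))
    return sum(aps) / len(aps) if aps else 0.0


# Toy example: 4 frames, 2 classes (values are made up).
scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6], [0.3, 0.1]]
targets = [[1, 0], [0, 1], [1, 0], [0, 0]]
map_value = mean_average_precision(scores, targets)
```

In this toy case the first class has an AP of 0.75 and the second an AP of 1.0, so `map_value` is 0.875. The table reports mAP on a 0 to 100 scale.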
