
Human-like Relational Models for Activity Recognition in Video

2021-07-12

Joseph Chrol-Cannon, Andrew Gilbert, Ranko Lazic, Adithya Madhusoodanan, Frank Guerin


Abstract

Video activity recognition by deep neural networks is impressive for many classes. However, it falls short of human performance, especially for activities that are challenging to discriminate. Humans differentiate these complex activities by recognising critical spatio-temporal relations among explicitly recognised objects and parts, for example, an object entering the aperture of a container. Deep neural networks can struggle to learn such critical relationships effectively. Therefore, we propose a more human-like approach to activity recognition, which interprets a video in sequential temporal phases and extracts specific relationships among objects and hands in those phases. Random forest classifiers are learnt from these extracted relationships. We apply the method to a challenging subset of the Something-Something dataset and achieve more robust performance than neural network baselines on challenging activities.
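To make the idea of phase-wise relational features concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-frame bounding boxes for a hand, an object and a container are already available, splits the video into equal temporal phases, and records how often two illustrative relations hold in each phase. The `Box` format, the relation names, the three-phase split and the demo data are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical input format)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def centre(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

    def contains_point(self, x: float, y: float) -> bool:
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share any area."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


# Illustrative relations only; the paper's actual relation set is not given here.
RELATION_NAMES = ["hand_touches_object", "object_inside_container"]


def frame_relations(frame: Dict[str, Box]) -> Dict[str, bool]:
    """Evaluate each relation on a single frame of detections."""
    hand, obj, container = frame["hand"], frame["object"], frame["container"]
    return {
        "hand_touches_object": overlaps(hand, obj),
        "object_inside_container": container.contains_point(*obj.centre()),
    }


def phase_features(frames: List[Dict[str, Box]], n_phases: int = 3) -> List[float]:
    """For each equal-length phase, the fraction of frames where each relation holds."""
    features: List[float] = []
    n = len(frames)
    for p in range(n_phases):
        segment = frames[p * n // n_phases:(p + 1) * n // n_phases]
        relations = [frame_relations(f) for f in segment]
        for name in RELATION_NAMES:
            held = sum(r[name] for r in relations)
            features.append(held / len(relations) if relations else 0.0)
    return features


def make_demo_frames() -> List[Dict[str, Box]]:
    """Synthetic clip: a hand carries an object into a container, then lets go."""
    container = Box(100, 100, 200, 200)
    frames = []
    for t, cx in enumerate([20, 50, 80, 110, 140, 150]):
        obj = Box(cx - 10, 140, cx + 10, 160)
        # The hand holds the object for the first four frames, then moves away.
        hand = Box(cx - 15, 120, cx + 5, 150) if t < 4 else Box(0, 0, 10, 10)
        frames.append({"hand": hand, "object": obj, "container": container})
    return frames


if __name__ == "__main__":
    print(phase_features(make_demo_frames()))
```

Each video becomes a fixed-length feature vector (here, two relations times three phases), which is the kind of input a random forest classifier could be trained on, for instance with scikit-learn's `RandomForestClassifier`. How the phases are actually delimited and which relations are extracted are details of the paper that this sketch does not attempt to reproduce.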
