Detecting Human-Object Relationships in Videos

2021-01-01 · ICCV 2021

Jingwei Ji, Rishi Desai, Juan Carlos Niebles

Abstract

We study a crucial problem in video analysis: human-object relationship detection. The majority of previous approaches are developed only for the static image scenario, without incorporating the temporal dynamics so vital to contextualizing human-object relationships. We propose a model with Intra- and Inter-Transformers, enabling joint spatial and temporal reasoning on multiple visual concepts of objects, relationships, and human poses. We find that applying attention mechanisms among features distributed spatio-temporally greatly improves our understanding of human-object relationships. Our method is validated on two datasets, Action Genome and CAD-120-EVAR, and achieves state-of-the-art performance on both of them.
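
As a rough illustration of what "applying attention among features distributed spatio-temporally" can mean, the sketch below runs plain scaled dot-product attention over per-frame feature vectors of a single object. This is not the paper's Intra-/Inter-Transformer architecture. The function names (`softmax`, `attend`) and the toy `frame_features` values are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    # Scaled dot-product attention for a single query vector.
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    # Weighted sum of the value vectors, one output dimension at a time.
    out = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return out, weights

# Hypothetical toy input: one 2-D feature vector per video frame.
frame_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

# The first frame attends over all frames (itself included).
context, weights = attend(frame_features[0], frame_features, frame_features)
```

Here the first frame places more weight on frames whose features resemble its own, which is the basic mechanism by which temporal context can be pooled into a per-frame representation.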
