SOTAVerified

Temporal Action Proposal Generation With Action Frequency Adaptive Network

2023-06-23journal 2023Code Available0· sign in to hype

Yepeng Tang; Weining Wang; Chunjie Zhang; Jing Liu; Yao Zhao

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

As the cornerstone of human-behavior analysis in video understanding, temporal action proposal generation aims to predict the starting and ending time of human action instances in untrimmed videos. Although large achievements in temporal action proposal generation have been achieved, most previous studies ignore the variability of action frequency in raw videos, leading to unsatisfying performances on high-action-frequency videos. In fact, there exists two main issues which should be well addressed: data imbalance between high and low action-frequency videos, and inferior detection of short actions in high-action-frequency videos. To address the above issues, we propose an effective framework by adapting to the variability of action frequency, namely Action Frequency Adaptive Network (AFAN), which can be flexibly built upon any temporal action proposal generation method. AFAN consists of two modules: Learning From Experts (LFE) and Fine-Grained Processing (FGP). The LFE first trains a series of action proposal generators on different subsets of imbalanced data as experts and then teaches a unified student model via knowledge distillation. To better detect short actions, FGP first finds out high-action-frequency videos and then performs fine-grained detection. Extensive experimental results on four benchmark datasets (ActivityNet-1.3, HACS, THUMOS14 and FineAction) demonstrate the effectiveness and generalizability of the proposed AFAN, especially for high-action-frequency videos.

Tasks

Reproductions