Multi-scale Motion-Aware Module for Video Action Recognition

2023-02-19 · ECCV Workshops 2023

Huai-Wei Peng, Yu-Chee Tseng

Abstract

Because optical flow is expensive to compute, recent works use the correlation operation as an alternative for extracting motion features. Although correlation yields significant accuracy improvements at negligible FLOPs, it incurs much more latency per FLOP than convolution, and the latency grows noticeably as the searching patch is enlarged. Shrinking the searching patch, however, degrades performance because larger displacements can no longer be captured. In this paper, we propose an effective and low-latency Multi-Scale Motion-Aware (MSMA) module. It uses smaller searching patches at different scales to efficiently extract motion features from large displacements. It can be installed into different CNN backbones and generalizes well across them. When installed into TSM ResNet-50, the MSMA module introduces ≈ 17.6% more latency on an NVIDIA Tesla V100 GPU, yet it achieves state-of-the-art performance on Something-Something V1 & V2 and Diving-48.
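The idea of trading one large searching patch for small patches at several scales can be illustrated with a minimal NumPy sketch. This is not the authors' MSMA implementation: the function names, the average-pooling downsampler, and the choice of scale factors are assumptions made only for illustration. It computes a local correlation between two feature maps over a (2r+1) x (2r+1) patch, then repeats it on downsampled copies so that the same small patch covers proportionally larger displacements.

```python
import numpy as np

def local_correlation(f1, f2, radius):
    """Correlate f1 with f2 over a (2*radius+1)^2 search patch.

    f1, f2: feature maps of shape (C, H, W) from two adjacent frames.
    Returns an array of shape (P*P, H, W), P = 2*radius + 1, where
    channel i*P + j holds the channel-averaged dot product between
    f1 at (y, x) and f2 at (y + i - radius, x + j - radius).
    """
    c, h, w = f1.shape
    p = 2 * radius + 1
    # Zero-pad f2 so displaced positions outside the map contribute 0.
    padded = np.pad(f2, ((0, 0), (radius, radius), (radius, radius)))
    out = np.empty((p * p, h, w), dtype=f1.dtype)
    for i in range(p):        # vertical displacement dy = i - radius
        for j in range(p):    # horizontal displacement dx = j - radius
            shifted = padded[:, i:i + h, j:j + w]
            out[i * p + j] = (f1 * shifted).sum(axis=0) / c
    return out

def avg_pool(f, factor):
    """Downsample (C, H, W) by average pooling; H and W must divide evenly."""
    c, h, w = f.shape
    return f.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def multi_scale_correlation(f1, f2, radius, factors=(1, 2, 4)):
    """Hypothetical helper: same small patch applied at several scales.

    A displacement of d cells at downsampling factor s corresponds to
    roughly s * d pixels at full resolution.
    """
    return {s: local_correlation(avg_pool(f1, s), avg_pool(f2, s), radius)
            for s in factors}

# Toy example: a single activation moves 4 pixels to the right.
f1 = np.zeros((1, 8, 8)); f1[0, 2, 1] = 1.0
f2 = np.zeros((1, 8, 8)); f2[0, 2, 5] = 1.0
res = multi_scale_correlation(f1, f2, radius=1, factors=(1, 4))
```

In this toy case the radius-1 patch at full resolution misses the 4-pixel shift entirely (`res[1]` is all zeros), while at factor 4 the same patch responds at displacement dx = +1 for the coarse cell containing the activation. The cost per scale stays at nine dot products per position, on progressively smaller maps.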

Tasks

Action Recognition, GPU, Optical Flow Estimation, Temporal Action Localization
