
SlowFast Networks for Video Recognition

2018-12-10 · ICCV 2019 · Code Available

Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He

Abstract

We present SlowFast networks for video recognition. Our model involves (i) a Slow pathway, operating at low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at high frame rate, to capture motion at fine temporal resolution. The Fast pathway can be made very lightweight by reducing its channel capacity, yet can learn useful temporal information for video recognition. Our models achieve strong performance for both action classification and detection in video, and large improvements are pin-pointed as contributions by our SlowFast concept. We report state-of-the-art accuracy on major video recognition benchmarks, Kinetics, Charades and AVA. Code has been made available at: https://github.com/facebookresearch/SlowFast
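
The two-pathway idea in the abstract can be illustrated with a minimal frame-sampling sketch. It only shows how one clip might be sampled at two frame rates; it is not the authors' implementation. The function name `sample_pathways` and the values `slow_stride=16` and `alpha=8` (the assumed frame-rate ratio between the Fast and Slow pathways) are illustrative assumptions, not taken from this page.

```python
def sample_pathways(num_frames, slow_stride=16, alpha=8):
    """Return (slow_indices, fast_indices) for a clip of num_frames frames.

    slow_stride and alpha are assumed values chosen for illustration:
    the Slow pathway keeps one frame every slow_stride frames, and the
    Fast pathway samples alpha times more densely.
    """
    if slow_stride % alpha != 0:
        raise ValueError("slow_stride must be divisible by alpha")
    fast_stride = slow_stride // alpha
    # Slow pathway: few frames, intended to carry spatial semantics.
    slow = list(range(0, num_frames, slow_stride))
    # Fast pathway: many frames, intended to carry fine-grained motion.
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


slow_idx, fast_idx = sample_pathways(64)
print(len(slow_idx), len(fast_idx))  # 4 32
```

Under these assumptions, a 64-frame clip yields 4 frames for the Slow pathway and 32 for the Fast pathway. In the model described above, the Fast pathway would offset its larger frame count with a reduced channel capacity, which this sketch does not model.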

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| AVA v2.1 | SlowFast (Kinetics-400 pretraining) | mAP (Val) | 26.3 | | Unverified |
| AVA v2.1 | SlowFast++ (Kinetics-600 pretraining, NL) | mAP (Val) | 28.3 | | Unverified |
| AVA v2.1 | SlowFast (Kinetics-600 pretraining, NL) | mAP (Val) | 27.3 | | Unverified |
| AVA v2.1 | SlowFast (Kinetics-600 pretraining) | mAP (Val) | 26.8 | | Unverified |
| AVA v2.2 | SlowFast, 16x8 R101+NL (Kinetics-600 pretraining) | mAP | 27.5 | | Unverified |
| AVA v2.2 | SlowFast, 4x16, R50 (Kinetics-400 pretraining) | mAP | 21.9 | | Unverified |
| AVA v2.2 | SlowFast, 8x8, R101 (Kinetics-400 pretraining) | mAP | 23.8 | | Unverified |
| AVA v2.2 | SlowFast, 8x8 R101+NL (Kinetics-600 pretraining) | mAP | 27.1 | | Unverified |
| Diving-48 | SlowFast | Accuracy | 77.6 | | Unverified |
| H2O (2 Hands and Objects) | SlowFast | Actions Top-1 | 77.69 | | Unverified |
| Something-Something V2 | SlowFast | Top-1 Accuracy | 61.7 | | Unverified |

Reproductions