Neural Aggregation Network for Video Face Recognition

2016-03-17 · CVPR 2017

Jiaolong Yang, Peiran Ren, Dong-Qing Zhang, Dong Chen, Fang Wen, Hongdong Li, Gang Hua

Abstract

This paper presents a Neural Aggregation Network (NAN) for video face recognition. The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition. The whole network is composed of two modules. The feature embedding module is a deep Convolutional Neural Network (CNN) which maps each face image to a feature vector. The aggregation module consists of two attention blocks which adaptively aggregate the feature vectors to form a single feature inside the convex hull spanned by them. Due to the attention mechanism, the aggregation is invariant to the image order. Our NAN is trained with a standard classification or verification loss without any extra supervision signal, and we found that it automatically learns to advocate high-quality face images while repelling low-quality ones such as blurred, occluded and improperly exposed faces. The experiments on IJB-A, YouTube Face, Celebrity-1000 video face recognition benchmarks show that it consistently outperforms naive aggregation methods and achieves the state-of-the-art accuracy.
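The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes dot-product scoring followed by a softmax, and a second block whose query is derived from the first block's output through a tanh layer. The names `attention_block`, `aggregate`, `q0`, `W`, and `b`, and all numeric values, are hypothetical placeholders rather than the authors' code or trained parameters. Because the softmax weights are non-negative and sum to one, the pooled vector lies inside the convex hull of the inputs, and because the weights depend only on each feature and the query, shuffling the inputs leaves the result unchanged.

```python
import math
import random


def attention_block(features, query):
    """Pool feature vectors using softmax attention weights (illustrative)."""
    # Score each feature by its dot product with the query vector.
    scores = [sum(q * x for q, x in zip(query, f)) for f in features]
    # Numerically stable softmax: weights are non-negative and sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Convex combination of the inputs, so the result stays in their hull.
    dim = len(features[0])
    pooled = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return pooled, weights


def aggregate(features, q0, W, b):
    """Cascade two attention blocks (assumed form: q1 = tanh(W r0 + b))."""
    r0, _ = attention_block(features, q0)
    q1 = [
        math.tanh(sum(W[i][j] * r0[j] for j in range(len(r0))) + b[i])
        for i in range(len(b))
    ]
    return attention_block(features, q1)


# Hypothetical 2-D "face features" for three frames of one video.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
q0 = [0.3, -0.2]                    # illustrative query for the first block
W = [[0.5, 0.1], [-0.4, 0.2]]       # illustrative transfer-layer weights
b = [0.0, 0.1]                      # illustrative transfer-layer bias

pooled, weights = aggregate(features, q0, W, b)

# Same frames in a different order should give the same pooled feature.
shuffled = features[:]
random.Random(0).shuffle(shuffled)
pooled_shuffled, _ = aggregate(shuffled, q0, W, b)
```

In a trained network the queries and the transfer layer would be learned jointly with the classification or verification loss; here they are fixed by hand only to show the mechanics.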

Tasks

Face Identification, Face Recognition, Face Verification

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
BTS3.1	NAN (Adaface)	TAR @ FAR=0.01	0.54	—	Unverified
BTS3.1	MCN (Arcface)	TAR @ FAR=0.01	0.39	—	Unverified
BTS3.1	NAN (Arcface)	TAR @ FAR=0.01	0.39	—	Unverified
IJB-A	NAN	TAR @ FAR=0.01	0.941	—	Unverified
