Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)

2019-06-25

Joon Son Chung

Abstract

This report describes our submission to the ActivityNet Challenge at CVPR 2019. We use a 3D convolutional neural network (CNN)-based front-end and an ensemble of temporal convolution and LSTM classifiers to predict whether a visible person is speaking. Our results show significant improvements over the baseline on the AVA-ActiveSpeaker dataset.
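
The abstract does not say how the temporal convolution and LSTM classifier outputs are combined. As a minimal sketch only, assuming each classifier emits one speaking probability per video frame and that the ensemble is a weighted average followed by a threshold, the fusion step might look like the following. The names `ensemble_scores` and `label_frames`, the `weight` and `threshold` parameters, and the example scores are all hypothetical and are not taken from the report.

```python
def ensemble_scores(tcn_probs, lstm_probs, weight=0.5):
    """Blend two per-frame speaking-probability sequences.

    Assumption: a weighted average. The report's actual combination
    rule is not given in the abstract.
    """
    if len(tcn_probs) != len(lstm_probs):
        raise ValueError("score sequences must cover the same frames")
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(tcn_probs, lstm_probs)]


def label_frames(probs, threshold=0.5):
    """Mark a frame as 'speaking' when its fused score reaches the threshold."""
    return [p >= threshold for p in probs]


# Made-up scores for four consecutive frames of one face track.
tcn_probs = [0.9, 0.8, 0.3, 0.1]
lstm_probs = [0.7, 0.6, 0.5, 0.2]

fused = ensemble_scores(tcn_probs, lstm_probs)
labels = label_frames(fused)
print(labels)
```

Averaging is only one possible choice; the two classifiers could equally be combined by a learned weighting or by voting, and the threshold would normally be tuned on a validation split.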

Tasks

Active Speaker Detection, Audio-Visual Active Speaker Detection
