Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues

2018-05-08

Cristina Palmero, Javier Selva, Mohammad Ali Bagheri, Sergio Escalera


Code

github.com/crisie/RecurrentGaze (official, in paper; TensorFlow; 0 stars)
github.com/crisie/CRNN-Gaze (TensorFlow; 0 stars)
github.com/code-implementation1/Code9/tree/main/CRNN (MindSpore; 0 stars)

Abstract

Gaze behavior is an important non-verbal cue in social signal processing and human-computer interaction. In this paper, we tackle the problem of person- and head pose-independent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN). We propose combining the face, the eye region, and facial landmarks as individual streams in a CNN to estimate gaze in still images. We then exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame. Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on the EYEDIAP dataset, further improved by 4% when the temporal modality is included.
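
The data flow described in the abstract (per-frame features from three streams, fused and passed to a many-to-one recurrent module) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the feature sizes, the random weights, the plain tanh recurrence, and the function names `fuse_frame` and `many_to_one_gaze` are all assumptions made here, and random vectors stand in for the CNN features that the real model would learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature sizes, chosen only for illustration.
FACE_DIM, EYES_DIM, LANDMARK_DIM, HIDDEN_DIM = 8, 8, 6, 16


def fuse_frame(face_feat, eyes_feat, landmark_feat):
    """Concatenate the per-stream features of a single frame."""
    return np.concatenate([face_feat, eyes_feat, landmark_feat])


def many_to_one_gaze(frames, w_in, w_rec, w_out):
    """Run a plain tanh recurrence over the fused frame features and
    return a single unit 3D gaze vector for the last frame."""
    h = np.zeros(w_rec.shape[0])
    for x in frames:
        # The hidden state carries information from earlier frames forward.
        h = np.tanh(w_in @ x + w_rec @ h)
    gaze = w_out @ h  # only the final hidden state is decoded
    return gaze / np.linalg.norm(gaze)


fused_dim = FACE_DIM + EYES_DIM + LANDMARK_DIM
w_in = rng.normal(size=(HIDDEN_DIM, fused_dim)) * 0.1
w_rec = rng.normal(size=(HIDDEN_DIM, HIDDEN_DIM)) * 0.1
w_out = rng.normal(size=(3, HIDDEN_DIM))

# A 4-frame sequence; random vectors stand in for learned CNN features.
sequence = [
    fuse_frame(
        rng.normal(size=FACE_DIM),
        rng.normal(size=EYES_DIM),
        rng.normal(size=LANDMARK_DIM),
    )
    for _ in range(4)
]

gaze = many_to_one_gaze(sequence, w_in, w_rec, w_out)
print(gaze.shape, np.linalg.norm(gaze))
```

Whatever the sequence length, only one gaze vector is produced, which is what makes the module many-to-one.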

Tasks

Gaze Estimation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
EYEDIAP (floating target)	RecurrentGaze (Temporal)	Angular Error	5.19	—	Unverified
EYEDIAP (floating target)	RecurrentGaze (Static)	Angular Error	5.43	—	Unverified
EYEDIAP (screen target)	RecurrentGaze (Static)	Angular Error	3.38	—	Unverified
EYEDIAP (screen target)	RecurrentGaze (Temporal)	Angular Error	3.4	—	Unverified
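
For readers unfamiliar with the metric, the sketch below shows one common way to compute the angular error between a predicted and a ground-truth 3D gaze vector, and how a relative improvement between two table rows can be derived. It assumes the errors are reported in degrees, which the table does not state. The function names `angular_error_deg` and `relative_improvement` are hypothetical, and the listed repositories may use a different evaluation protocol.

```python
import numpy as np


def angular_error_deg(pred, target):
    """Angle in degrees between predicted and ground-truth 3D gaze vectors."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    cos = np.sum(pred * target, axis=-1) / (
        np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1)
    )
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))


def relative_improvement(baseline, improved):
    """Fractional reduction in error when going from baseline to improved."""
    return (baseline - improved) / baseline


# Toy example: the two vectors are 45 degrees apart.
err = angular_error_deg([0.0, 0.0, -1.0], [0.0, 1.0, -1.0])

# Floating-target rows from the table: static 5.43, temporal 5.19.
gain = relative_improvement(5.43, 5.19)
print(f"toy error: {err:.1f} deg, temporal gain: {gain:.1%}")
```

On the floating-target rows this gives a gain of roughly 4.4%, which appears consistent with the "further improved by 4%" figure in the abstract.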
