De^2Gaze: Deformable and Decoupled Representation Learning for 3D Gaze Estimation

2025-01-01 · CVPR 2025

Yunfeng Xiao, Xiaowei Bai, Baojun Chen, Hao Su, Hao He, Liang Xie, Erwei Yin

Abstract

3D gaze estimation is challenging for two main reasons. First, existing methods focus on analyzing dense features (e.g., large pixel regions), which are sensitive to local noise (e.g., light spots, blurs) and increase computational complexity. Second, a single eyeball model can correspond to multiple gaze directions, and the entangled representation of gazes and models increases the learning difficulty. To address these issues, we propose De^2Gaze, a lightweight and accurate model-aware 3D gaze estimation method. De^2Gaze introduces two key innovations for deformable and decoupled representation learning. First, we propose a deformable sparse attention mechanism that adapts sparse sampling points to attention areas, avoiding the influence of local noise. Second, we propose a spatial decoupling network with a dual-branch decoding architecture that disentangles invariant features (e.g., eyeball radius and position) from variable ones (e.g., gaze, pupil, iris) in the latent space. Compared to existing methods, De^2Gaze requires fewer sparse features and achieves faster convergence, lower computational complexity, and higher accuracy in 3D gaze estimation. Qualitative and quantitative experiments demonstrate that De^2Gaze achieves state-of-the-art accuracy and high-quality semantic segmentation for 3D gaze estimation on the TEyeD dataset.
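The abstract does not give the mechanism's equations, but the general idea of deformable sparse attention (as popularized by deformable-attention architectures) can be sketched as follows: instead of attending densely over the whole feature map, a query aggregates a small number K of bilinearly sampled points at learned offsets around a reference location, weighted by softmax attention weights. The function names, shapes, and the offset/weight parameterization below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Bilinearly sample a (H, W, C) feature map at continuous coords (x, y)."""
    H, W, _ = feat.shape
    x = float(np.clip(x, 0, W - 1))
    y = float(np.clip(y, 0, H - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * feat[y0, x0]
            + wx * (1 - wy) * feat[y0, x1]
            + (1 - wx) * wy * feat[y1, x0]
            + wx * wy * feat[y1, x1])

def deformable_sparse_attention(feat, ref_point, offsets, weights):
    """Aggregate K sparse samples around a reference point.

    feat:      (H, W, C) feature map
    ref_point: (x, y) reference location for the query
    offsets:   (K, 2) learned sampling offsets (hypothetical parameterization)
    weights:   (K,) unnormalized attention logits over the K points
    """
    # Softmax over the K sampling points, numerically stabilized.
    w = np.exp(weights - weights.max())
    w /= w.sum()
    out = np.zeros(feat.shape[-1])
    for k in range(len(offsets)):
        x = ref_point[0] + offsets[k, 0]
        y = ref_point[1] + offsets[k, 1]
        out += w[k] * bilinear_sample(feat, x, y)
    return out
```

Because only K points are sampled per query rather than the full H x W grid, cost grows with K instead of the map size, which matches the abstract's claim of lower computational complexity from sparse features; in a trained model the offsets and weights would be predicted from the query so the sampling points can shift away from noisy regions such as light spots.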
