Context Matters: Self-Attention for Sign Language Recognition

2021-01-12 · Code Available

Fares Ben Slimane, Mohamed Bouguessa

Abstract

This paper proposes an attentional network for the task of Continuous Sign Language Recognition. The proposed approach exploits co-independent streams of data to model the sign language modalities. These different channels of information can share a complex temporal structure between each other. For that reason, we apply attention to synchronize and help capture entangled dependencies between the different sign language components. Even though Sign Language is multi-channel, handshapes represent the central entities in sign interpretation. Seeing handshapes in their correct context defines the meaning of a sign. Taking that into account, we utilize the attention mechanism to efficiently aggregate the hand features with their appropriate spatio-temporal context for better sign recognition. We found that by doing so the model is able to identify the essential Sign Language components that revolve around the dominant hand and the face areas. We test our model on the benchmark dataset RWTH-PHOENIX-Weather 2014, yielding competitive results.
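
The aggregation of hand features with their context can be illustrated with generic scaled dot-product attention. The sketch below is not the authors' architecture. It assumes, purely for illustration, that per-frame hand features act as queries and per-frame context features act as keys and values. The function names, feature sizes, and toy data are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention on plain Python lists.

    queries: list of d-dimensional vectors (assumed: hand features).
    keys, values: lists of vectors of equal length (assumed: context features).
    Returns (outputs, weights), one row per query.
    """
    d = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy data: 2 hand-feature frames, 3 context frames.
hand = [[1.0, 0.0], [0.0, 1.0]]
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attend(hand, context, context)
```

Each row of `weights` sums to 1 and shows how strongly one hand frame attends to each context frame. Weights of this kind are what one would inspect to see which regions or time steps a model relies on.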

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| RWTH-PHOENIX-Weather 2014 | SAN | Word Error Rate (WER) | 29.7 | | Unverified |
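
Word Error Rate is conventionally the word-level edit distance between the recognized sequence and the reference, divided by the reference length, so lower is better. The sketch below assumes the reported value is a percentage and that tokens are separated by whitespace. The function name and the example sentences are hypothetical, and this is not the benchmark's official scoring tool.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent: (substitutions + deletions + insertions) / N.

    N is the number of words in the reference. Tokens are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical gloss sequences: one substitution in four reference words.
wer = word_error_rate("morgen regen im norden", "morgen schnee im norden")
print(wer)  # 25.0
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.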
