SOTAVerified

Visual Speech Recognition

Papers

Showing 121–130 of 182 papers

Title | Status | Hype
Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition for Single and Multi-Person Video | - | 0
Recent Progress in the CUHK Dysarthric Speech Recognition System | - | 0
Leveraging Uni-Modal Self-Supervised Learning for Multimodal Audio-visual Speech Recognition | - | 0
Advances and Challenges in Deep Lip Reading | - | 0
Sub-word Level Lip Reading With Visual Attention | - | 0
Perception Point: Identifying Critical Learning Periods in Speech for Bilingual Networks | - | 0
Audio-Visual Speech Recognition is Worth 32×32×8 Voxels | - | 0
LRWR: Large-Scale Benchmark for Lip Reading in Russian language | - | 0
Large-vocabulary Audio-visual Speech Recognition in Noisy Environments | - | 0
Spatio-Temporal Attention Mechanism and Knowledge Distillation for Lip Reading | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 30.7 | - | Unverified
2 | CTC/Attention | Word Error Rate (WER) | 19.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 22.6 | - | Unverified