SOTAVerified

Visual Speech Recognition

Papers

Showing 111-120 of 182 papers

Title | Status | Hype
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer | | 0
Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis | | 0
Multimodal Machine Learning: Integrating Language, Vision and Speech | | 0
AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations | | 0
Multi-Temporal Lip-Audio Memory for Visual Speech Recognition | | 0
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition | | 0
NaturalL2S: End-to-End High-quality Multispeaker Lip-to-Speech Synthesis with Differential Digital Signal Processing | | 0
"Notic My Speech" -- Blending Speech Patterns With Multimedia | | 0
Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading | | 0
Automated Speaker Independent Visual Speech Recognition: A Comprehensive Survey | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 30.7 | | Unverified
2 | CTC/Attention | Word Error Rate (WER) | 19.1 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 22.6 | | Unverified