SOTAVerified

Visual Speech Recognition

Papers

Showing 21–30 of 182 papers

Title | Status | Hype
The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023 | Code | 1
Do VSR Models Generalize Beyond LRS3? | Code | 1
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper | Code | 1
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder | Code | 1
MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition | Code | 1
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition | Code | 1
OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment | Code | 1
MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information | Code | 1
Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | Code | 1
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 30.7 | | Unverified
2 | CTC/Attention | Word Error Rate (WER) | 19.1 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 22.6 | | Unverified