SOTAVerified

Visual Speech Recognition

Papers

Showing 31–40 of 182 papers

Title | Status | Hype
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring | Code | 1
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition | Code | 1
OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset | Code | 1
Jointly Learning Visual and Auditory Speech Representations from Raw Data | Code | 1
Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition | Code | 1
CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition | Code | 1
Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition | Code | 1
CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition | Code | 1
End-to-end Audio-visual Speech Recognition with Conformers | Code | 1
AV Taris: Online Audio-Visual Speech Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 30.7 | - | Unverified
2 | CTC/Attention | Word Error Rate (WER) | 19.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 22.6 | - | Unverified