SOTAVerified

Visual Speech Recognition

Papers

Showing 76-100 of 182 papers

Title | Status | Hype
Part-based Lipreading for Audio-Visual Speech Recognition | - | 0
Perception Point: Identifying Critical Learning Periods in Speech for Bilingual Networks | - | 0
Perfect match: Improved cross-modal embeddings for audio-visual synchronisation | - | 0
Preliminary Test of a Real-Time, Interactive Silent Speech Interface Based on Electromagnetic Articulograph | - | 0
Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition | - | 0
Quantitative Analysis of Audio-Visual Tasks: An Information-Theoretic Perspective | - | 0
Rate-Invariant Analysis of Trajectories on Riemannian Manifolds with Application in Visual Speech Recognition | - | 0
Recent Progress in the CUHK Dysarthric Speech Recognition System | - | 0
Recognition of Isolated Words using Zernike and MFCC features for Audio Visual Speech Recognition | - | 0
Resolution limits on visual speech recognition | - | 0
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement | - | 0
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration | - | 0
JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition | - | 0
3D Feature Pyramid Attention Module for Robust Visual Speech Recognition | - | 0
Adapter-Based Multi-Agent AVSR Extension for Pre-Trained ASR Models | - | 0
Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs | - | 0
Advances and Challenges in Deep Lip Reading | - | 0
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model | - | 0
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset | - | 0
Analysis of Visual Features for Continuous Lipreading in Spanish | - | 0
Another Point of View on Visual Speech Recognition | - | 0
ASR is all you need: cross-modal distillation for lip reading | - | 0
A three-dimensional approach to Visual Speech Recognition using Discrete Cosine Transforms | - | 0
Audio-visual Recognition of Overlapped speech for the LRS2 dataset | - | 0
Audio-Visual Speech and Gesture Recognition by Sensors of Mobile Devices | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 30.7 | - | Unverified
2 | CTC/Attention | Word Error Rate (WER) | 19.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 22.6 | - | Unverified