SOTAVerified

Visual Speech Recognition

Papers

Showing 101-150 of 182 papers

Title | Status | Hype
Audio-Visual Speech Recognition is Worth 32×32×8 Voxels | - | 0
Audio Visual Speech Recognition using Deep Recurrent Neural Networks | - | 0
Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture | - | 0
Automated Speaker Independent Visual Speech Recognition: A Comprehensive Survey | - | 0
Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading | - | 0
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition | - | 0
AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations | - | 0
Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis | - | 0
Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides | - | 0
CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition Challenge | - | 0
CNVSRC 2024: The Second Chinese Continuous Visual Speech Recognition Challenge | - | 0
Cocktail-Party Audio-Visual Speech Recognition | - | 0
Combining Multiple Views for Visual Speech Recognition | - | 0
Comparison of Conventional Hybrid and CTC/Attention Decoders for Continuous Visual Speech Recognition | - | 0
Conformers are All You Need for Visual Speech Recognition | - | 0
Continuous Speech Recognition using EEG and Video | - | 0
DCIM-AVSR : Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module | - | 0
Deep Learning-based Spatio Temporal Facial Feature Visual Speech Recognition | - | 0
Deep Learning for Visual Speech Analysis: A Survey | - | 0
Deep Lip Reading: a comparison of models and an online application | - | 0
Deep Multimodal Learning for Audio-Visual Speech Recognition | - | 0
Deep Multimodal Representation Learning from Temporal Data | - | 0
Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video | - | 0
Detecting Adversarial Attacks On Audiovisual Speech Recognition | - | 0
End-to-End Lip Reading in Romanian with Cross-Lingual Domain Adaptation and Lateral Inhibition | - | 0
End-to-End Visual Speech Recognition for Small-Scale Datasets | - | 0
End-To-End Visual Speech Recognition With LSTMs | - | 0
Enhancing CTC-Based Visual Speech Recognition | - | 0
Fusing information streams in end-to-end audio-visual speech recognition | - | 0
Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning | - | 0
Interactive decoding of words from visual speech recognition models | - | 0
Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition | - | 0
Is Lip Region-of-Interest Sufficient for Lipreading? | - | 0
Kaggle Competition: Cantonese Audio-Visual Speech Recognition for In-car Commands | - | 0
Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition for Single and Multi-Person Video | - | 0
Uncovering the Visual Contribution in Audio-Visual Speech Recognition | - | 0
VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning | - | 0
ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition | - | 0
Video-Based Action Recognition Using Rate-Invariant Analysis of Covariance Trajectories | - | 0
Visual-Aware Speech Recognition for Noisy Scenarios | - | 0
Visual-Only Recognition of Normal, Whispered and Silent Speech | - | 0
VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis | - | 0
Visual Speech Recognition | - | 0
Visual speech recognition: aligning terminologies for better understanding | - | 0
Visual Speech Recognition in a Driver Assistance System | - | 0
Visual Speech Recognition Using PCA Networks and LSTMs in a Tandem GMM-HMM System | - | 0
Visual Words for Automatic Lip-Reading | - | 0
Which phoneme-to-viseme maps best improve visual-only computer lip-reading? | - | 0
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception | - | 0
RUSAVIC Corpus: Russian Audio-Visual Speech in Cars | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 30.7 | - | Unverified
2 | CTC/Attention | Word Error Rate (WER) | 19.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTP with more data | Word Error Rate (WER) | 22.6 | - | Unverified