SOTAVerified

Visual Speech Recognition

Papers

Showing 51-60 of 182 papers

| Title | Status | Hype |
|---|---|---|
| Leveraging Large Language Models in Visual Speech Recognition: Model Scaling, Context-Aware Decoding, and Iterative Polishing | | 0 |
| CNVSRC 2024: The Second Chinese Continuous Visual Speech Recognition Challenge | | 0 |
| Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach | | 0 |
| Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language | Code | 0 |
| The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition | | 0 |
| SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer | | 0 |
| Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides | | 0 |
| Visual-Aware Speech Recognition for Noisy Scenarios | | 0 |
| Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs | | 0 |
| NaturalL2S: End-to-End High-quality Multispeaker Lip-to-Speech Synthesis with Differential Digital Signal Processing | | 0 |

Benchmark Results

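| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VTP with more data | Word Error Rate (WER) | 30.7 | | Unverified |
| 2 | CTC/Attention | Word Error Rate (WER) | 19.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VTP with more data | Word Error Rate (WER) | 22.6 | | Unverified |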