| Audio-Visual Speech Recognition based on Regulated Transformer and Spatio-Temporal Fusion Strategy for Driver Assistive Systems | May 9, 2024 | Audio-Visual Speech Recognition, Lipreading | Code Available | 0 | 5 |
| Combining Residual Networks with LSTMs for Lipreading | Mar 12, 2017 | Lipreading, Lip Reading | Code Available | 0 | 5 |
| Deep word embeddings for visual speech recognition | Oct 30, 2017 | Lipreading, speech-recognition | Code Available | 0 | 5 |
| Evaluation of End-to-End Continuous Spanish Lipreading in Different Data Conditions | Feb 1, 2025 | Lipreading, speech-recognition | Code Available | 0 | 5 |
| Harnessing GANs for Zero-shot Learning of New Classes in Visual Speech Recognition | Jan 29, 2019 | speech-recognition, Speech Recognition | Code Available | 0 | 5 |
| LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild | Nov 21, 2023 | Automatic Speech Recognition, speech-recognition | Code Available | 0 | 5 |
| SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data | Aug 1, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 0 | 5 |
| LRS3-TED: a large-scale dataset for visual speech recognition | Sep 3, 2018 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 0 | 5 |
| LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild | Oct 16, 2018 | Lipreading, Lip Reading | Code Available | 0 | 5 |
| Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation | Jan 7, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 0 | 5 |
| Recurrent Neural Network Transducer for Audio-Visual Speech Recognition | Nov 8, 2019 | Audio-Visual Speech Recognition, Lipreading | Code Available | 0 | 5 |
| Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language | May 20, 2025 | Multi-Task Learning, Sign Language Recognition | Code Available | 0 | 5 |
| Deep Multimodal Representation Learning from Temporal Data | Apr 11, 2017 | Audio-Visual Speech Recognition, Representation Learning | Unverified | 0 | 0 |
| Deep Multimodal Learning for Audio-Visual Speech Recognition | Jan 22, 2015 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 | 0 |
| Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning | May 23, 2023 | Metric Learning, speech-recognition | Unverified | 0 | 0 |
| Interactive decoding of words from visual speech recognition models | Jul 1, 2021 | Position, speech-recognition | Unverified | 0 | 0 |
| Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition | Jun 5, 2019 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 | 0 |
| Is Lip Region-of-Interest Sufficient for Lipreading? | May 28, 2022 | Lipreading, Self-Supervised Learning | Unverified | 0 | 0 |
| Deep Lip Reading: a comparison of models and an online application | Jun 15, 2018 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition | Mar 4, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Deep Learning for Visual Speech Analysis: A Survey | May 22, 2022 | Deep Learning, speech-recognition | Unverified | 0 | 0 |
| Kaggle Competition: Cantonese Audio-Visual Speech Recognition for In-car Commands | Jul 6, 2022 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 | 0 |
| Deep Learning-based Spatio Temporal Facial Feature Visual Speech Recognition | Apr 30, 2023 | Deep Learning, Face Recognition | Unverified | 0 | 0 |
| Large-Scale Visual Speech Recognition | Jul 13, 2018 | Decoder, Lipreading | Unverified | 0 | 0 |
| Large-vocabulary Audio-visual Speech Recognition in Noisy Environments | Sep 10, 2021 | Audio-Visual Speech Recognition, Lipreading | Unverified | 0 | 0 |
| Learn2Talk: 3D Talking Face Learns from 2D Talking Face | Apr 19, 2024 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 | 0 |
| DCIM-AVSR: Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module | Aug 31, 2024 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 | 0 |
| Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition | Feb 15, 2022 | Audio-Visual Speech Recognition, Lipreading | Unverified | 0 | 0 |
| Continuous Speech Recognition using EEG and Video | Dec 16, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Leveraging Large Language Models in Visual Speech Recognition: Model Scaling, Context-Aware Decoding, and Iterative Polishing | May 27, 2025 | speech-recognition, Speech Recognition | Unverified | 0 | 0 |
| Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning | Dec 10, 2022 | Audio-Visual Speech Recognition, reinforcement-learning | Unverified | 0 | 0 |
| Leveraging Uni-Modal Self-Supervised Learning for Multimodal Audio-visual Speech Recognition | Nov 16, 2021 | Audio-Visual Speech Recognition, Language Modelling | Unverified | 0 | 0 |
| Conformers are All You Need for Visual Speech Recognition | Feb 17, 2023 | All, Lipreading | Unverified | 0 | 0 |
| Lightweight Operations for Visual Speech Recognition | Feb 7, 2025 | speech-recognition, Speech Recognition | Unverified | 0 | 0 |
| Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping | Aug 11, 2023 | Lip Reading, speech-recognition | Unverified | 0 | 0 |
| LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition | Jan 8, 2025 | Lip Reading, speech-recognition | Unverified | 0 | 0 |
| Lip Graph Assisted Audio-Visual Speech Recognition Using Bidirectional Synchronous Fusion | Oct 25, 2020 | Audio-Visual Speech Recognition, Landmark-based Lipreading | Unverified | 0 | 0 |
| Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models | Jun 5, 2022 | Knowledge Distillation, Lipreading | Unverified | 0 | 0 |
| Lip Reading Sentences in the Wild | Nov 16, 2016 | Lipreading, Lip Reading | Unverified | 0 | 0 |
| AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model | Aug 15, 2023 | Quantization, speech-recognition | Unverified | 0 | 0 |
| Comparison of Conventional Hybrid and CTC/Attention Decoders for Continuous Visual Speech Recognition | Feb 20, 2024 | Decoder, speech-recognition | Unverified | 0 | 0 |
| Advances and Challenges in Deep Lip Reading | Oct 15, 2021 | Deep Learning, Lip Reading | Unverified | 0 | 0 |
| Listening With Your Eyes: Towards a Practical Visual Speech Recognition System Using Deep Boltzmann Machines | Dec 1, 2015 | speech-recognition, Speech Recognition | Unverified | 0 | 0 |
| LiteVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data | Dec 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Which phoneme-to-viseme maps best improve visual-only computer lip-reading? | Oct 3, 2017 | Lip Reading, speech-recognition | Unverified | 0 | 0 |
| Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs | Mar 9, 2025 | Audio-Visual Speech Recognition, Computational Efficiency | Unverified | 0 | 0 |
| LRWR: Large-Scale Benchmark for Lip Reading in Russian language | Sep 14, 2021 | Lipreading, Lip Reading | Unverified | 0 | 0 |
| Manifold-Kernels Comparison in MKPLS for Visual Speech Recognition | Jan 22, 2016 | speech-recognition, Speech Recognition | Unverified | 0 | 0 |
| Combining Multiple Views for Visual Speech Recognition | Oct 19, 2017 | Sentence, speech-recognition | Unverified | 0 | 0 |
| Cocktail-Party Audio-Visual Speech Recognition | Jun 2, 2025 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 | 0 |