| Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition | Feb 15, 2022 | Audio-Visual Speech Recognition, Lipreading | Unverified | 0 |
| Leveraging Large Language Models in Visual Speech Recognition: Model Scaling, Context-Aware Decoding, and Iterative Polishing | May 27, 2025 | speech-recognition, Speech Recognition | Unverified | 0 |
| Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning | Dec 10, 2022 | Audio-Visual Speech Recognition, reinforcement-learning | Unverified | 0 |
| Leveraging Uni-Modal Self-Supervised Learning for Multimodal Audio-visual Speech Recognition | Nov 16, 2021 | Audio-Visual Speech Recognition, Language Modelling | Unverified | 0 |
| Lightweight Operations for Visual Speech Recognition | Feb 7, 2025 | speech-recognition, Speech Recognition | Unverified | 0 |
| Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping | Aug 11, 2023 | Lip Reading, speech-recognition | Unverified | 0 |
| LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition | Jan 8, 2025 | Lip Reading, speech-recognition | Unverified | 0 |
| Lip Graph Assisted Audio-Visual Speech Recognition Using Bidirectional Synchronous Fusion | Oct 25, 2020 | Audio-Visual Speech Recognition, Landmark-based Lipreading | Unverified | 0 |
| Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models | Jun 5, 2022 | Knowledge Distillation, Lipreading | Unverified | 0 |
| Lip Reading Sentences in the Wild | Nov 16, 2016 | Lipreading, Lip Reading | Unverified | 0 |
| Listening With Your Eyes: Towards a Practical Visual Speech Recognition System Using Deep Boltzmann Machines | Dec 1, 2015 | speech-recognition, Speech Recognition | Unverified | 0 |
| LiteVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data | Dec 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| LRWR: Large-Scale Benchmark for Lip Reading in Russian language | Sep 14, 2021 | Lipreading, Lip Reading | Unverified | 0 |
| Manifold-Kernels Comparison in MKPLS for Visual Speech Recognition | Jan 22, 2016 | speech-recognition, Speech Recognition | Unverified | 0 |
| MKPLS: Manifold Kernel Partial Least Squares for Lipreading and Speaker Identification | Jun 1, 2013 | Lipreading, Speaker Identification | Unverified | 0 |
| MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition | Jan 7, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| MobiVSR: A Visual Speech Recognition Solution for Mobile Devices | May 10, 2019 | Lip Reading, Quantization | Unverified | 0 |
| Modality Attention for End-to-End Audio-visual Speech Recognition | Nov 13, 2018 | Audio-Visual Speech Recognition, Robust Speech Recognition | Unverified | 0 |
| MoHAVE: Mixture of Hierarchical Audio-Visual Experts for Robust Speech Recognition | Feb 11, 2025 | Audio-Visual Speech Recognition, Computational Efficiency | Unverified | 0 |
| MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization | Jun 25, 2024 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |
| Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer | Mar 14, 2024 | Audio-Visual Speech Recognition, Robust Speech Recognition | Unverified | 0 |
| Multimodal Machine Learning: Integrating Language, Vision and Speech | Jul 1, 2017 | Audio-Visual Speech Recognition, BIG-bench Machine Learning | Unverified | 0 |
| Multi-Temporal Lip-Audio Memory for Visual Speech Recognition | May 8, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| NaturalL2S: End-to-End High-quality Multispeaker Lip-to-Speech Synthesis with Differential Digital Signal Processing | Feb 17, 2025 | Lip to Speech Synthesis, speech-recognition | Unverified | 0 |
| "Notic My Speech" -- Blending Speech Patterns With Multimedia | Jun 12, 2020 | speech-recognition, Speech Recognition | Unverified | 0 |
| Part-based Lipreading for Audio-Visual Speech Recognition | Dec 14, 2020 | Audio-Visual Speech Recognition, Lipreading | Unverified | 0 |
| Perception Point: Identifying Critical Learning Periods in Speech for Bilingual Networks | Oct 13, 2021 | Lip Reading, speech-recognition | Unverified | 0 |
| Perfect match: Improved cross-modal embeddings for audio-visual synchronisation | Sep 21, 2018 | Binary Classification, Cross-Modal Retrieval | Unverified | 0 |
| Preliminary Test of a Real-Time, Interactive Silent Speech Interface Based on Electromagnetic Articulograph | Jun 1, 2014 | Speech Recognition, Visual Speech Recognition | Unverified | 0 |
| Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition | Feb 16, 2023 | Sentence, speech-recognition | Unverified | 0 |
| Quantitative Analysis of Audio-Visual Tasks: An Information-Theoretic Perspective | Sep 29, 2024 | Audio-Visual Speech Recognition, Lip Reading | Unverified | 0 |
| Rate-Invariant Analysis of Trajectories on Riemannian Manifolds with Application in Visual Speech Recognition | Jun 1, 2014 | Activity Recognition, Classification | Unverified | 0 |
| Recent Progress in the CUHK Dysarthric Speech Recognition System | Jan 15, 2022 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| Recognition of Isolated Words using Zernike and MFCC features for Audio Visual Speech Recognition | Jul 4, 2014 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| Resolution limits on visual speech recognition | Oct 3, 2017 | Lip Reading, speech-recognition | Unverified | 0 |
| ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement | Dec 21, 2022 | Audio-Visual Speech Recognition, Resynthesis | Unverified | 0 |
| ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration | Jan 1, 2023 | Audio-Visual Speech Recognition, Resynthesis | Unverified | 0 |
| JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition | Mar 4, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| 3D Feature Pyramid Attention Module for Robust Visual Speech Recognition | Oct 15, 2018 | Lipreading, Sentence | Unverified | 0 |
| Adapter-Based Multi-Agent AVSR Extension for Pre-Trained ASR Models | Feb 3, 2025 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |
| Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs | Mar 9, 2025 | Audio-Visual Speech Recognition, Computational Efficiency | Unverified | 0 |
| Advances and Challenges in Deep Lip Reading | Oct 15, 2021 | Deep Learning, Lip Reading | Unverified | 0 |
| AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model | Aug 15, 2023 | Quantization, speech-recognition | Unverified | 0 |
| A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset | Jan 21, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| Analysis of Visual Features for Continuous Lipreading in Spanish | Nov 21, 2023 | Lipreading, speech-recognition | Unverified | 0 |
| Another Point of View on Visual Speech Recognition | Aug 20, 2023 | Landmark-based Lipreading, speech-recognition | Unverified | 0 |
| ASR is all you need: cross-modal distillation for lip reading | Nov 28, 2019 | All, Automatic Speech Recognition | Unverified | 0 |
| A three-dimensional approach to Visual Speech Recognition using Discrete Cosine Transforms | Sep 7, 2016 | speech-recognition, Speech Recognition | Unverified | 0 |
| Audio-visual Recognition of Overlapped speech for the LRS2 dataset | Jan 6, 2020 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Audio-Visual Speech and Gesture Recognition by Sensors of Mobile Devices | Feb 17, 2023 | Audio-Visual Speech Recognition, Gesture Recognition | Unverified | 0 |