| Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis | May 1, 2012 | Audio-Visual Speech RecognitionSpeech Recognition | —Unverified | 0 |
| AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model | Aug 15, 2023 | Quantizationspeech-recognition | —Unverified | 0 |
| Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning | Dec 10, 2022 | Audio-Visual Speech Recognitionreinforcement-learning | —Unverified | 0 |
| End-to-End Lip Reading in Romanian with Cross-Lingual Domain Adaptation and Lateral Inhibition | Oct 7, 2023 | Domain AdaptationLip Reading | —Unverified | 0 |
| End-to-End Visual Speech Recognition for Small-Scale Datasets | Apr 2, 2019 | General Classificationspeech-recognition | —Unverified | 0 |
| End-To-End Visual Speech Recognition With LSTMs | Jan 20, 2017 | ClassificationGeneral Classification | —Unverified | 0 |
| Detecting Adversarial Attacks On Audiovisual Speech Recognition | Dec 18, 2019 | Audio-Visual Speech Recognitionspeech-recognition | —Unverified | 0 |
| Enhancing CTC-Based Visual Speech Recognition | Sep 11, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | —Unverified | 0 |
| Fusing information streams in end-to-end audio-visual speech recognition | Apr 19, 2021 | Audio-Visual Speech RecognitionLip Reading | —Unverified | 0 |
| Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides | Apr 21, 2025 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | —Unverified | 0 |
| AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations | Feb 10, 2023 | Audio-Visual Speech RecognitionSelf-Supervised Learning | —Unverified | 0 |
| Advances and Challenges in Deep Lip Reading | Oct 15, 2021 | Deep LearningLip Reading | —Unverified | 0 |
| AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition | Sep 29, 2023 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | —Unverified | 0 |
| Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video | Feb 27, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | —Unverified | 0 |
| ASR is all you need: cross-modal distillation for lip reading | Nov 28, 2019 | AllAutomatic Speech Recognition | —Unverified | 0 |
| Deep Multimodal Representation Learning from Temporal Data | Apr 11, 2017 | Audio-Visual Speech RecognitionRepresentation Learning | —Unverified | 0 |
| Deep Multimodal Learning for Audio-Visual Speech Recognition | Jan 22, 2015 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | —Unverified | 0 |
| Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading | Jan 16, 2017 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | —Unverified | 0 |
| 3D Feature Pyramid Attention Module for Robust Visual Speech Recognition | Oct 15, 2018 | LipreadingSentence | —Unverified | 0 |
| Leveraging Uni-Modal Self-Supervised Learning for Multimodal Audio-visual Speech Recognition | Nov 16, 2021 | Audio-Visual Speech RecognitionLanguage Modelling | —Unverified | 0 |
| Deep Lip Reading: a comparison of models and an online application | Jun 15, 2018 | Language ModelingLanguage Modelling | —Unverified | 0 |
| Deep Learning for Visual Speech Analysis: A Survey | May 22, 2022 | Deep Learningspeech-recognition | —Unverified | 0 |
| Automated Speaker Independent Visual Speech Recognition: A Comprehensive Survey | Jun 14, 2023 | speech-recognitionSpeech Recognition | —Unverified | 0 |
| Deep Learning-based Spatio Temporal Facial Feature Visual Speech Recognition | Apr 30, 2023 | Deep LearningFace Recognition | —Unverified | 0 |
| Another Point of View on Visual Speech Recognition | Aug 20, 2023 | Landmark-based Lipreadingspeech-recognition | —Unverified | 0 |