| SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition | Jan 18, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023 | Jan 7, 2024 | Decoder, speech-recognition | Code Available | 1 |
| Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation | Jan 7, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 0 |
| MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition | Jan 7, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| LiteVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data | Dec 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| The GUA-Speech System Description for CNVSRC Challenge 2023 | Dec 12, 2023 | Decoder, Language Modeling | Unverified | 0 |
| Do VSR Models Generalize Beyond LRS3? | Nov 23, 2023 | Lip Reading, speech-recognition | Code Available | 1 |
| Analysis of Visual Features for Continuous Lipreading in Spanish | Nov 21, 2023 | Lipreading, speech-recognition | Unverified | 0 |
| Speaker-Adapted End-to-End Visual Speech Recognition for Continuous Spanish | Nov 21, 2023 | speech-recognition, Speech Recognition | Unverified | 0 |
| LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild | Nov 21, 2023 | Automatic Speech Recognition, speech-recognition | Code Available | 0 |
| End-to-End Lip Reading in Romanian with Cross-Lingual Domain Adaptation and Lateral Inhibition | Oct 7, 2023 | Domain Adaptation, Lip Reading | Unverified | 0 |
| AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition | Sep 29, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction | Sep 15, 2023 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |
| Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper | Sep 15, 2023 | Language Identification, speech-recognition | Code Available | 1 |
| Another Point of View on Visual Speech Recognition | Aug 20, 2023 | Landmark-based Lipreading, speech-recognition | Unverified | 0 |
| AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model | Aug 15, 2023 | Quantization, speech-recognition | Unverified | 0 |
| Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder | Aug 14, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 1 |
| Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping | Aug 11, 2023 | Lip Reading, speech-recognition | Unverified | 0 |
| SparseVSR: Lightweight and Noise Robust Visual Speech Recognition | Jul 10, 2023 | speech-recognition, Speech Recognition | Unverified | 0 |
| MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition | Jun 18, 2023 | Audio-Visual Speech Recognition, Representation Learning | Code Available | 1 |
| Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition | Jun 18, 2023 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| Automated Speaker Independent Visual Speech Recognition: A Comprehensive Survey | Jun 14, 2023 | speech-recognition, Speech Recognition | Unverified | 0 |
| OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment | Jun 10, 2023 | Audio-Visual Speech Recognition, Lip Reading | Code Available | 1 |
| MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information | Jun 4, 2023 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning | May 23, 2023 | Metric Learning, speech-recognition | Unverified | 0 |
| Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | May 18, 2023 | Audio-Visual Speech Recognition, Prompt Engineering | Code Available | 1 |
| Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition | May 16, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 1 |
| Multi-Temporal Lip-Audio Memory for Visual Speech Recognition | May 8, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Deep Learning-based Spatio Temporal Facial Feature Visual Speech Recognition | Apr 30, 2023 | Deep Learning, Face Recognition | Unverified | 0 |
| SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision | Mar 30, 2023 | Lip Reading, speech-recognition | Unverified | 0 |
| Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels | Mar 25, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 2 |
| Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring | Mar 15, 2023 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| The NPU-ASLP System for Audio-Visual Speech Recognition in MISP 2022 Challenge | Mar 11, 2023 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |
| MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition | Mar 9, 2023 | Lip Reading, Machine Translation | Code Available | 1 |
| MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation | Mar 1, 2023 | Audio-Visual Speech Recognition, Robust Speech Recognition | Code Available | 2 |
| Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video | Feb 27, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Conformers are All You Need for Visual Speech Recognition | Feb 17, 2023 | All, Lipreading | Unverified | 0 |
| Audio-Visual Speech and Gesture Recognition by Sensors of Mobile Devices | Feb 17, 2023 | Audio-Visual Speech Recognition, Gesture Recognition | Unverified | 0 |
| Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition | Feb 16, 2023 | Sentence, speech-recognition | Unverified | 0 |
| AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations | Feb 10, 2023 | Audio-Visual Speech Recognition, Self-Supervised Learning | Unverified | 0 |
| A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset | Jan 21, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset | Jan 16, 2023 | Audio-Visual Speech Recognition, Lip Reading | Code Available | 1 |
| ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration | Jan 1, 2023 | Audio-Visual Speech Recognition, Resynthesis | Unverified | 0 |
| ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement | Dec 21, 2022 | Audio-Visual Speech Recognition, Resynthesis | Unverified | 0 |
| Jointly Learning Visual and Auditory Speech Representations from Raw Data | Dec 12, 2022 | Audio-Visual Speech Recognition, Lipreading | Code Available | 1 |
| Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning | Dec 10, 2022 | Audio-Visual Speech Recognition, reinforcement-learning | Unverified | 0 |
| VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning | Nov 21, 2022 | Audio-Visual Speech Recognition, Language Modelling | Unverified | 0 |
| Streaming Audio-Visual Speech Recognition with Alignment Regularization | Nov 3, 2022 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| Visual Speech Recognition in a Driver Assistance System | Aug 29, 2022 | Data Augmentation, Lipreading | Unverified | 0 |
| Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition | Jul 13, 2022 | Audio-Visual Speech Recognition, Decoder | Code Available | 1 |