| Deep Audio-Visual Speech Recognition | Sep 6, 2018 | Audio-Visual Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 | 5 |
| AlignVSR: Audio-Visual Cross-Modal Alignment for Visual Speech Recognition | Oct 21, 2024 | cross-modal alignmentspeech-recognition | CodeCode Available | 1 | 5 |
| Do VSR Models Generalize Beyond LRS3? | Nov 23, 2023 | Lip Readingspeech-recognition | CodeCode Available | 1 | 5 |
| Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Feb 9, 2025 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 1 | 5 |
| CI-AVSR: A Cantonese Audio-Visual Speech Datasetfor In-car Command Recognition | Jun 1, 2022 | Audio-Visual Speech Recognitionspeech-recognition | CodeCode Available | 1 | 5 |
| End-to-end Audio-visual Speech Recognition with Conformers | Feb 12, 2021 | Audio-Visual Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 | 5 |
| Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition | May 16, 2023 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 1 | 5 |
| Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation | Jan 23, 2025 | Audio-Visual Speech RecognitionMulti-Task Learning | CodeCode Available | 1 | 5 |
| How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition | Apr 17, 2020 | Audio-Visual Speech Recognitionspeech-recognition | CodeCode Available | 1 | 5 |
| Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition | Jun 18, 2023 | Audio-Visual Speech Recognitionspeech-recognition | CodeCode Available | 1 | 5 |