| MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition | Jun 18, 2023 | Audio-Visual Speech Recognition, Representation Learning | Code Available | 1 |
| OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment | Jun 10, 2023 | Audio-Visual Speech Recognition, Lip Reading | Code Available | 1 |
| MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information | Jun 4, 2023 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | May 18, 2023 | Audio-Visual Speech Recognition, Prompt Engineering | Code Available | 1 |
| Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition | May 16, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 1 |
| Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring | Mar 15, 2023 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition | Mar 9, 2023 | Lip Reading, Machine Translation | Code Available | 1 |
| OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset | Jan 16, 2023 | Audio-Visual Speech Recognition, Lip Reading | Code Available | 1 |
| Jointly Learning Visual and Auditory Speech Representations from Raw Data | Dec 12, 2022 | Audio-Visual Speech Recognition, Lipreading | Code Available | 1 |
| Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition | Jul 13, 2022 | Audio-Visual Speech Recognition, Decoder | Code Available | 1 |
| CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition | Jun 1, 2022 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition | Feb 24, 2022 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition | Jan 11, 2022 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| End-to-end Audio-visual Speech Recognition with Conformers | Feb 12, 2021 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection | Dec 14, 2020 | DeepFake Detection, Lipreading | Code Available | 1 |
| AV Taris: Online Audio-Visual Speech Recognition | Dec 14, 2020 | Action Detection, Activity Detection | Code Available | 1 |
| Learn an Effective Lip Reading Model without Pains | Nov 15, 2020 | Lipreading, Lip Reading | Code Available | 1 |
| Should we hard-code the recurrence concept or learn it instead ? Exploring the Transformer architecture for Audio-Visual Speech Recognition | May 19, 2020 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition | Apr 17, 2020 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition | Mar 6, 2020 | Lipreading, Lip Reading | Code Available | 1 |
| Deep Audio-Visual Speech Recognition | Sep 6, 2018 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Zero-shot keyword spotting for visual speech recognition in-the-wild | Jul 23, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis | Jul 8, 2025 | Automatic Speech Recognition, Lip Reading | Unverified | 0 |
| ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition | Jun 5, 2025 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |
| Cocktail-Party Audio-Visual Speech Recognition | Jun 2, 2025 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |