| End-to-end Audio-visual Speech Recognition with Conformers | Feb 12, 2021 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Do VSR Models Generalize Beyond LRS3? | Nov 23, 2023 | Lip Reading, speech-recognition | Code Available | 1 | 5 |
| Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition | Jun 18, 2023 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 | 5 |
| Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder | Aug 14, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 1 | 5 |
| MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition | Mar 9, 2023 | Lip Reading, Machine Translation | Code Available | 1 | 5 |
| AV Taris: Online Audio-Visual Speech Recognition | Dec 14, 2020 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition | Feb 8, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 1 | 5 |
| Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition | Mar 6, 2020 | Lipreading, Lip Reading | Code Available | 1 | 5 |
| CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition | Jun 1, 2022 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 | 5 |
| Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection | Dec 14, 2020 | DeepFake Detection, Lipreading | Code Available | 1 | 5 |