| AV Taris: Online Audio-Visual Speech Recognition | Dec 14, 2020 | Action Detection, Activity Detection | Code Available | 1 |
| Learn an Effective Lip Reading Model without Pains | Nov 15, 2020 | Lipreading, Lip Reading | Code Available | 1 |
| Should we hard-code the recurrence concept or learn it instead ? Exploring the Transformer architecture for Audio-Visual Speech Recognition | May 19, 2020 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition | Apr 17, 2020 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition | Mar 6, 2020 | Lipreading, Lip Reading | Code Available | 1 |
| Deep Audio-Visual Speech Recognition | Sep 6, 2018 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Zero-shot keyword spotting for visual speech recognition in-the-wild | Jul 23, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis | Jul 8, 2025 | Automatic Speech Recognition, Lip Reading | —Unverified | 0 |
| ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition | Jun 5, 2025 | Audio-Visual Speech Recognition, speech-recognition | —Unverified | 0 |
| Cocktail-Party Audio-Visual Speech Recognition | Jun 2, 2025 | Audio-Visual Speech Recognition, speech-recognition | —Unverified | 0 |