| MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition | Jun 18, 2023 | Audio-Visual Speech Recognition, Representation Learning | Code Available | 1 |
| Automated Speaker Independent Visual Speech Recognition: A Comprehensive Survey | Jun 14, 2023 | speech-recognition, Speech Recognition | Unverified | 0 |
| OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment | Jun 10, 2023 | Audio-Visual Speech Recognition, Lip Reading | Code Available | 1 |
| MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information | Jun 4, 2023 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1 |
| Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning | May 23, 2023 | Metric Learning, speech-recognition | Unverified | 0 |
| Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | May 18, 2023 | Audio-Visual Speech Recognition, Prompt Engineering | Code Available | 1 |
| Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition | May 16, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 1 |
| Multi-Temporal Lip-Audio Memory for Visual Speech Recognition | May 8, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Deep Learning-based Spatio Temporal Facial Feature Visual Speech Recognition | Apr 30, 2023 | Deep Learning, Face Recognition | Unverified | 0 |
| SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision | Mar 30, 2023 | Lip Reading, speech-recognition | Unverified | 0 |