| VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis | Jul 8, 2025 | Automatic Speech Recognition, Lip Reading | Unverified | 0 |
| ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition | Jun 5, 2025 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |
| Cocktail-Party Audio-Visual Speech Recognition | Jun 2, 2025 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0 |
| CNVSRC 2024: The Second Chinese Continuous Visual Speech Recognition Challenge | May 27, 2025 | Diversity, speech-recognition | Unverified | 0 |
| Leveraging Large Language Models in Visual Speech Recognition: Model Scaling, Context-Aware Decoding, and Iterative Polishing | May 27, 2025 | speech-recognition, Speech Recognition | Unverified | 0 |
| Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language | May 20, 2025 | Multi-Task Learning, Sign Language Recognition | Code Available | 0 |
| Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach | May 20, 2025 | Audio-Visual Speech Recognition, Mixture-of-Experts | Unverified | 0 |
| The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition | May 20, 2025 | Audio-Visual Speech Recognition, speaker-diarization | Unverified | 0 |
| SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer | May 7, 2025 | Audio-Visual Speech Recognition, Lip Reading | Unverified | 0 |
| CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization | May 6, 2025 | Active Speaker Detection, Audio-Visual Speech Recognition | Code Available | 2 |