| VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning | Nov 21, 2022 | Audio-Visual Speech Recognition, Language Modelling | —Unverified | 0 | 0 |
| ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition | Jun 5, 2025 | Audio-Visual Speech Recognition, speech-recognition | —Unverified | 0 | 0 |
| Video-Based Action Recognition Using Rate-Invariant Analysis of Covariance Trajectories | Mar 23, 2015 | Action Recognition, General Classification | —Unverified | 0 | 0 |
| Visual-Aware Speech Recognition for Noisy Scenarios | Apr 9, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | —Unverified | 0 | 0 |
| ASR is all you need: cross-modal distillation for lip reading | Nov 28, 2019 | All, Automatic Speech Recognition | —Unverified | 0 | 0 |
| Visual-Only Recognition of Normal, Whispered and Silent Speech | Feb 18, 2018 | Silent Speech Recognition, speech-recognition | —Unverified | 0 | 0 |
| VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis | Jul 8, 2025 | Automatic Speech Recognition, Lip Reading | —Unverified | 0 | 0 |
| Visual Speech Recognition | Sep 3, 2014 | Audio-Visual Speech Recognition, Lip Reading | —Unverified | 0 | 0 |
| Visual speech recognition: aligning terminologies for better understanding | Oct 3, 2017 | Lipreading, speech-recognition | —Unverified | 0 | 0 |
| Another Point of View on Visual Speech Recognition | Aug 20, 2023 | Landmark-based Lipreading, speech-recognition | —Unverified | 0 | 0 |