| Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder | Aug 14, 2023 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 1 |
| OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation | Aug 8, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus | Jul 29, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures | Jul 27, 2023 | Automatic Speech RecognitionContrastive Learning | CodeCode Available | 1 |
| Adaptation of Whisper models to child speech recognition | Jul 24, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| A Reference-less Quality Metric for Automatic Speech Recognition via Contrastive-Learning of a Multi-Language Model with Self-Supervision | Jun 21, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning | Jun 21, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Quilt-1M: One Million Image-Text Pairs for Histopathology | Jun 20, 2023 | Automatic Speech RecognitionCross-Modal Retrieval | CodeCode Available | 1 |
| Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation | Jun 15, 2023 | Automatic Speech RecognitionClustering | CodeCode Available | 1 |
| SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization | Jun 3, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |