| MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages | Oct 1, 2024 | Automatic Speech Recognitionspeech-recognition | CodeCode Available | 2 |
| Robust Self-Supervised Audio-Visual Speech Recognition | Jan 5, 2022 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 2 |
| Large Language Models are Strong Audio-Visual Speech Recognition Learners | Sep 18, 2024 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 2 |
| LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT | Oct 7, 2023 | Audio captioningAutomatic Speech Recognition | CodeCode Available | 2 |
| Fast Transformers with Clustered Attention | Jul 9, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography | Oct 26, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| Dialectal Coverage And Generalization in Arabic Speech Recognition | Nov 7, 2024 | Arabic Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 2 |
| An Embarrassingly Simple Approach for LLM with Strong ASR Capacity | Feb 13, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec | Sep 14, 2023 | Automatic Speech Recognitionspeech-recognition | CodeCode Available | 2 |
| Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | Jan 5, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation | Feb 8, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts | Nov 2, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Adaptation of Whisper models to child speech recognition | Jul 24, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Distilling Knowledge from Ensembles of Acoustic Models for Joint CTC-Attention End-to-End Speech Recognition | May 19, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Framework for Curating Speech Datasets and Evaluating ASR Systems: A Case Study for Polish | Jul 18, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Distilling the Knowledge of BERT for Sequence-to-Sequence ASR | Aug 9, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Dompteur: Taming Audio Adversarial Examples | Feb 10, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition | Aug 1, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| DiaCorrect: Error Correction Back-end For Speaker Diarization | Sep 15, 2023 | Automatic Speech RecognitionDecoder | CodeCode Available | 1 |
| Distilling a Pretrained Language Model to a Multilingual ASR Model | Jun 25, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation | Nov 2, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities | May 23, 2025 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition | Oct 26, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CTC-synchronous Training for Monotonic Attention Model | May 10, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Cross-modal information fusion for voice spoofing detection | Feb 1, 2023 | Automatic Speech Recognitionfake voice detection | CodeCode Available | 1 |
| D4AM: A General Denoising Framework for Downstream Acoustic Models | Nov 28, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Deep Contextualized Acoustic Representations For Semi-Supervised Speech Recognition | Dec 3, 2019 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese | Oct 14, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Cross Attention Augmented Transducer Networks for Simultaneous Translation | Nov 1, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Continuous speech separation: dataset and analysis | Jan 30, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI | Dec 5, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models | Jul 5, 2024 | Adversarial AttackAutomatic Speech Recognition | CodeCode Available | 1 |
| Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition | May 16, 2023 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 1 |
| Deep Sparse Conformer for Speech Recognition | Sep 1, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition | Mar 28, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition | Mar 2, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers | Apr 20, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Combining Frame-Synchronous and Label-Synchronous Systems for Speech Recognition | Jul 1, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Can Contextual Biasing Remain Effective with Whisper and GPT-2? | Jun 2, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Oct 24, 2022 | Action DetectionActivity Detection | CodeCode Available | 1 |
| Can we use Common Voice to train a Multi-Speaker TTS system? | Oct 12, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CL-MASR: A Continual Learning Benchmark for Multilingual ASR | Oct 25, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Oct 12, 2021 | Action DetectionActivity Detection | CodeCode Available | 1 |
| Brazilian Portuguese Speech Recognition Using Wav2vec 2.0 | Jul 23, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Adapting End-to-End Speech Recognition for Readable Subtitles | May 25, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CB-Conformer: Contextual biasing Conformer for biased word recognition | Apr 19, 2023 | Automatic Speech RecognitionLanguage Modeling | CodeCode Available | 1 |
| ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context | May 7, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech | Jun 16, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CopyNE: Better Contextual ASR by Copying Named Entities | May 22, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Common Voice: A Massively-Multilingual Speech Corpus | Dec 13, 2019 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |