| ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers | Apr 20, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Common Voice: A Massively-Multilingual Speech Corpus | Dec 13, 2019 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech | Jun 16, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Oct 24, 2022 | Action DetectionActivity Detection | CodeCode Available | 1 |
| A context-aware knowledge transferring strategy for CTC-based ASR | Oct 12, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition | Feb 2, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications | Apr 30, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Brazilian Portuguese Speech Recognition Using Wav2vec 2.0 | Jul 23, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Can Contextual Biasing Remain Effective with Whisper and GPT-2? | Jun 2, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Oct 12, 2021 | Action DetectionActivity Detection | CodeCode Available | 1 |
| Advancing Test-Time Adaptation in Wild Acoustic Test Settings | Oct 14, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Cross Attention Augmented Transducer Networks for Simultaneous Translation | Nov 1, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Accented Speech Recognition With Accent-specific Codebooks | Oct 24, 2023 | Accented Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 1 |
| CTC-synchronous Training for Monotonic Attention Model | May 10, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities | May 23, 2025 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition | Oct 26, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Can we use Common Voice to train a Multi-Speaker TTS system? | Oct 12, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| DiaCorrect: Error Correction Back-end For Speaker Diarization | Sep 15, 2023 | Automatic Speech RecognitionDecoder | CodeCode Available | 1 |
| BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm | Dec 11, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Back Translation for Speech-to-text Translation Without Transcripts | May 15, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| BembaSpeech: A Speech Recognition Corpus for the Bemba Language | Feb 9, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts | Nov 2, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Dompteur: Taming Audio Adversarial Examples | Feb 10, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation | Nov 2, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| AVLnet: Learning Audio-Visual Language Representations from Instructional Videos | Jun 16, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition | Aug 31, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features | Aug 3, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction | Mar 31, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| End-to-end Named Entity Recognition from English Speech | May 22, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation | Nov 1, 2023 | Automatic Speech Recognitionspeech-recognition | CodeCode Available | 1 |
| AV Taris: Online Audio-Visual Speech Recognition | Dec 14, 2020 | Action DetectionActivity Detection | CodeCode Available | 1 |
| AISHELL-NER: Named Entity Recognition from Chinese Speech | Feb 17, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition | Oct 24, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | Sep 18, 2019 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| ExKaldi-RT: A Real-Time Automatic Speech Recognition Extension Toolkit of Kaldi | Apr 3, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Extending Whisper with prompt tuning to target-speaker ASR | Dec 13, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data | Jan 28, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CB-Conformer: Contextual biasing Conformer for biased word recognition | Apr 19, 2023 | Automatic Speech RecognitionLanguage Modeling | CodeCode Available | 1 |
| Continuous speech separation: dataset and analysis | Jan 30, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Radically Old Way of Computing Spectra: Applications in End-to-End ASR | Mar 25, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Automatic Speech Recognition for Speech Assessment of Persian Preschool Children | Mar 24, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition | Feb 22, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications | Mar 31, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | Oct 7, 2023 | Automatic Speech RecognitionVideo Captioning | CodeCode Available | 1 |
| Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning | Nov 7, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Improved DeepFake Detection Using Whisper Features | Jun 2, 2023 | Automatic Speech RecognitionDeepFake Detection | CodeCode Available | 1 |
| Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder | Aug 14, 2023 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 1 |
| Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model | Jan 6, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation | Nov 9, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Automatic Speech Recognition Benchmark for Air-Traffic Communications | Jun 18, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |