| DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Feb 16, 2025 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Earnings-22: A Practical Benchmark for Accents in the Wild | Mar 29, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition | Aug 1, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| EnCodecMAE: Leveraging neural codecs for universal audio representation learning | Sep 14, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| End-to-end Named Entity Recognition from English Speech | May 22, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Adaptation of Whisper models to child speech recognition | Jul 24, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Cross-modal information fusion for voice spoofing detection | Feb 1, 2023 | Automatic Speech Recognitionfake voice detection | CodeCode Available | 1 |
| End-to-End Speech Recognition from Federated Acoustic Models | Apr 29, 2021 | 2k4k | CodeCode Available | 1 |
| Adapting End-to-End Speech Recognition for Readable Subtitles | May 25, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-Attention | Oct 19, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Extending Whisper with prompt tuning to target-speaker ASR | Dec 13, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Factorized Neural Transducer for Efficient Language Model Adaptation | Sep 27, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition | May 16, 2023 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 1 |
| CTC-synchronous Training for Monotonic Attention Model | May 10, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese | Oct 14, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| FlanEC: Exploring Flan-T5 for Post-ASR Error Correction | Jan 22, 2025 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Cross Attention Augmented Transducer Networks for Simultaneous Translation | Nov 1, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| D4AM: A General Denoising Framework for Downstream Acoustic Models | Nov 28, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Framework for Curating Speech Datasets and Evaluating ASR Systems: A Case Study for Polish | Jul 18, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| How2: A Large-scale Dataset for Multimodal Language Understanding | Nov 1, 2018 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | Oct 7, 2023 | Automatic Speech RecognitionVideo Captioning | CodeCode Available | 1 |
| HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models | Sep 27, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Improved DeepFake Detection Using Whisper Features | Jun 2, 2023 | Automatic Speech RecognitionDeepFake Detection | CodeCode Available | 1 |
| Improved Noisy Student Training for Automatic Speech Recognition | May 19, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model | Jan 6, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Improving Mandarin Speech Recogntion with Block-augmented Transformer | Jul 24, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| DiaCorrect: Error Correction Back-end For Speaker Diarization | Sep 15, 2023 | Automatic Speech RecognitionDecoder | CodeCode Available | 1 |
| Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech | Jun 16, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context | May 7, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Continuous speech separation: dataset and analysis | Jan 30, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition | Oct 22, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI | Dec 5, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models | Jul 5, 2024 | Adversarial AttackAutomatic Speech Recognition | CodeCode Available | 1 |
| Combining Frame-Synchronous and Label-Synchronous Systems for Speech Recognition | Jul 1, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Accented Speech Recognition With Accent-specific Codebooks | Oct 24, 2023 | Accented Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 1 |
| Common Voice: A Massively-Multilingual Speech Corpus | Dec 13, 2019 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition | Feb 2, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CopyNE: Better Contextual ASR by Copying Named Entities | May 22, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications | Apr 30, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| A context-aware knowledge transferring strategy for CTC-based ASR | Oct 12, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| CB-Conformer: Contextual biasing Conformer for biased word recognition | Apr 19, 2023 | Automatic Speech RecognitionLanguage Modeling | CodeCode Available | 1 |
| Can Contextual Biasing Remain Effective with Whisper and GPT-2? | Jun 2, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Oct 24, 2022 | Action DetectionActivity Detection | CodeCode Available | 1 |
| Can we use Common Voice to train a Multi-Speaker TTS system? | Oct 12, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data | Jan 28, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Oct 12, 2021 | Action DetectionActivity Detection | CodeCode Available | 1 |
| BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm | Dec 11, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| Advancing Test-Time Adaptation in Wild Acoustic Test Settings | Oct 14, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| BembaSpeech: A Speech Recognition Corpus for the Bemba Language | Feb 9, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |
| AVLnet: Learning Audio-Visual Language Representations from Instructional Videos | Jun 16, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 |