| Cross Attention Augmented Transducer Networks for Simultaneous Translation | Nov 1, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Information-Transport-based Policy for Simultaneous Translation | Oct 22, 2022 | Machine Translation, Speech-to-Text | Code Available | 1 | 5 |
| LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models | Jul 22, 2024 | Data Augmentation, Language Modeling | Code Available | 1 | 5 |
| Benchmarking Large Multimodal Models against Common Corruptions | Jan 22, 2024 | Benchmarking, Image to text | Code Available | 1 | 5 |
| One TTS Alignment To Rule Them All | Aug 23, 2021 | All, Speech Synthesis | Code Available | 1 | 5 |
| Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement | Dec 21, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Brilla AI: AI Contestant for the National Science and Maths Quiz | Mar 4, 2024 | Math, Question Answering | Code Available | 1 | 5 |
| Kosp2e: Korean Speech to English Translation Corpus | Jul 6, 2021 | speech-recognition, Speech Recognition | Code Available | 1 | 5 |
| "Listen, Understand and Translate": Triple Supervision Decouples End-to-end Speech-to-text Translation | Sep 21, 2020 | Speech-to-Text, Speech-to-Text Translation | Code Available | 1 | 5 |
| Late reverberation suppression using U-nets | Oct 5, 2021 | Decoder, Speech Dereverberation | Code Available | 1 | 5 |
| ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs | Jun 26, 2024 | ArzEn Code-switched Translation to ara, ArzEn Code-switched Translation to eng | Code Available | 1 | 5 |
| LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions | May 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing | Mar 18, 2022 | Representation Learning, Speaker Verification | Code Available | 1 | 5 |
| ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation | May 24, 2023 | GPU, Language Modeling | Code Available | 1 | 5 |
| CoVoST 2 and Massively Multilingual Speech-to-Text Translation | Jul 20, 2020 | Machine Translation, speech-recognition | Code Available | 1 | 5 |
| CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus | Feb 4, 2020 | Speech-to-Text, Speech-to-Text Translation | Code Available | 1 | 5 |
| Careless Whisper: Speech-to-Text Hallucination Harms | Feb 12, 2024 | Hallucination, Language Modeling | Code Available | 0 | 5 |
| Calibrated SVM for Probabilistic Classification of In-Vehicle Voices into Vehicle Commands via Voice-to-Text LLM Transformation | Jun 28, 2024 | Speech-to-Text, text-classification | Code Available | 0 | 5 |
| Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning | Sep 21, 2016 | Decoder, Multi-Task Learning | Code Available | 0 | 5 |
| Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision | Dec 30, 2023 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy | Oct 13, 2022 | Generative Adversarial Network, Speaker anonymization | Code Available | 0 | 5 |
| Kurdish (Sorani) Speech to Text: Presenting an Experimental Dataset | Nov 29, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 | 5 |
| Let's Give a Voice to Conversational Agents in Virtual Reality | Aug 4, 2023 | Speech-to-Text, text-to-speech | Code Available | 0 | 5 |
| BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System | May 29, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 | 5 |
| Infusing Future Information into Monotonic Attention Through Language Models | Sep 7, 2021 | Language Modeling, Language Modelling | Code Available | 0 | 5 |