| Seamless: Multilingual Expressive and Streaming Speech Translation | Dec 8, 2023 | automatic-speech-translation, Machine Translation | Code Available | 6 |
| MooER: LLM-based Speech Recognition and Translation Models from Moore Threads | Aug 9, 2024 | Automatic Speech Recognition (ASR) | Code Available | 3 |
| BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages | Nov 7, 2024 | automatic-speech-translation, Synthetic Data Generation | Code Available | 1 |
| LESS: Large Language Model Enhanced Semi-Supervised Learning for Speech Foundational Models | Jun 5, 2025 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Word Level Timestamp Generation for Automatic Speech Recognition and Translation | May 21, 2025 | Automatic Speech Recognition, automatic-speech-translation | Code Available | 0 |
| Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities | May 13, 2025 | automatic-speech-translation, Benchmarking | Unverified | 0 |
| EMMeTT: Efficient Multimodal Machine Translation Training | Sep 20, 2024 | automatic-speech-translation, Decoder | Unverified | 0 |
| Chain-of-Thought Prompting for Speech Translation | Sep 17, 2024 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text | Sep 17, 2024 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts | Jul 17, 2023 | automatic-speech-translation, Imitation Learning | Code Available | 0 |