| Calibrated SVM for Probabilistic Classification of In-Vehicle Voices into Vehicle Commands via Voice-to-Text LLM Transformation | Jun 28, 2024 | Speech-to-Text, text-classification | Code Available | 0 |
| Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects | Jun 27, 2024 | Automatic Speech Recognition, Machine Translation | Code Available | 0 |
| ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs | Jun 26, 2024 | ArzEn Code-switched Translation to ara, ArzEn Code-switched Translation to eng | Code Available | 1 |
| Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet | Jun 25, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Revisiting Interpolation Augmentation for Speech-to-Text Generation | Jun 22, 2024 | Speech-to-Text, Text Generation | Code Available | 1 |
| SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | Jun 20, 2024 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 |
| Transferable speech-to-text large language model alignment module | Jun 19, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving | Jun 16, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation | Jun 14, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models | Jun 13, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |