| Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision | Dec 30, 2023 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| LibriS2S: A German-English Speech-to-Speech Translation Corpus | Apr 22, 2022 | Speech-to-Speech Translation, Speech-to-Text | Code Available | 0 | 5 |
| Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models | Jul 9, 2024 | coreference-resolution, Coreference Resolution | Code Available | 0 | 5 |
| Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation | Dec 6, 2016 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation | Jul 3, 2022 | Decoder, Speech-to-Text | Code Available | 0 | 5 |
| Pre-training on high-resource speech recognition improves low-resource speech-to-text translation | Sep 5, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 | 5 |
| Revisiting End-to-End Speech-to-Text Translation From Scratch | Jun 9, 2022 | Decoder, speech-recognition | Code Available | 0 | 5 |
| SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | Jun 20, 2024 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| SparQLe: Speech Queries to Text Translation Through LLMs | Feb 13, 2025 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| Speechformer: Reducing Information Loss in Direct Speech Translation | Sep 9, 2021 | Speech-to-Text Translation, Translation | Code Available | 0 | 5 |
| StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection | Jun 10, 2024 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding | Dec 16, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 | 5 |
| Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects | Jun 27, 2024 | Automatic Speech Recognition, Machine Translation | Code Available | 0 | 5 |
| WACO: Word-Aligned Contrastive Learning for Speech Translation | Dec 19, 2022 | Contrastive Learning, Speech-to-Text | Code Available | 0 | 5 |
| The USFD Spoken Language Translation System for IWSLT 2014 | Sep 13, 2015 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Instance-Based Model Adaptation For Direct Speech Translation | Oct 23, 2019 | Domain Adaptation, Speech-to-Text | —Unverified | 0 | 0 |
| Interpreting Strategies Annotation in the WAW Corpus | Sep 1, 2017 | Machine Translation, Speech-to-Text | —Unverified | 0 | 0 |
| Investigating Decoder-only Large Language Models for Speech-to-text Translation | Jul 3, 2024 | Decoder, parameter-efficient fine-tuning | —Unverified | 0 | 0 |
| End-to-End Speech-to-Text Translation: A Survey | Dec 2, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Towards Measuring Fairness in AI: the Casual Conversations Dataset | Apr 6, 2021 | Age And Gender Classification, DeepFake Detection | —Unverified | 0 | 0 |
| Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages | Nov 11, 2024 | Decoder, Machine Translation | —Unverified | 0 | 0 |
| Language Model Augmented Monotonic Attention for Simultaneous Translation | Jul 1, 2022 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| End-to-End Offline Speech Translation System for IWSLT 2020 using Modality Agnostic Meta-Learning | Jul 1, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Efficient Monotonic Multihead Attention | Dec 7, 2023 | Simultaneous Speech-to-Text Translation, Speech-to-Text | —Unverified | 0 | 0 |
| Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation | May 1, 2022 | Segmentation, Simultaneous Speech-to-Text Translation | —Unverified | 0 | 0 |
| Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR | Jun 11, 2021 | Simultaneous Speech-to-Text Translation, Speech-to-Text | —Unverified | 0 | 0 |
| Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation | Nov 5, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Towards speech-to-text translation without speech recognition | Feb 13, 2017 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems | Oct 13, 2021 | Sentence, Simultaneous Speech-to-Text Translation | —Unverified | 0 | 0 |
| Towards the evaluation of automatic simultaneous speech translation from a communicative perspective | Mar 15, 2021 | automatic-speech-translation, Informativeness | —Unverified | 0 | 0 |
| Towards Unsupervised Speech-to-Text Translation | Nov 4, 2018 | Denoising, Language Modeling | —Unverified | 0 | 0 |
| Data Efficient Direct Speech-to-Text Translation with Modality Agnostic Meta-Learning | Nov 11, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Low-Resource Speech-to-Text Translation | Mar 24, 2018 | Decoder, Machine Translation | —Unverified | 0 | 0 |
| M3ST: Mix at Three Levels for Speech Translation | Dec 7, 2022 | Data Augmentation, Diversity | —Unverified | 0 | 0 |
| Analyzing ASR pretraining for low-resource speech-to-text translation | Oct 23, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| MAM: Masked Acoustic Modeling for End-to-End Speech-to-Text Translation | Oct 22, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| CTC Alignments Improve Autoregressive Translation | Oct 11, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer | Oct 5, 2023 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 | 0 |
| Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing | Sep 27, 2023 | Decoder, Machine Translation | —Unverified | 0 | 0 |
| NAIST Simultaneous Speech-to-Text Translation System for IWSLT 2022 | May 1, 2022 | Segmentation, Simultaneous Speech-to-Text Translation | —Unverified | 0 | 0 |
| NAIST Simultaneous Speech Translation System for IWSLT 2024 | Jun 30, 2024 | Speech-to-Speech Translation, Speech-to-Text | —Unverified | 0 | 0 |
| Cross-modal Contrastive Learning for Speech Translation | Dec 17, 2021 | Contrastive Learning, Retrieval | —Unverified | 0 | 0 |
| Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis, Automatic Speech Recognition | —Unverified | 0 | 0 |
| On decoder-only architecture for speech-to-text and large language model integration | Jul 8, 2023 | Decoder, Language Modeling | —Unverified | 0 | 0 |
| COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning | Nov 3, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling | Jun 21, 2021 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| Contextual Biasing to Improve Domain-specific Custom Vocabulary Audio Transcription without Explicit Fine-Tuning of Whisper Model | Oct 24, 2024 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces | May 18, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases | Feb 1, 2024 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| Compact Speech Translation Models via Discrete Speech Units Pretraining | Feb 29, 2024 | Decoder, Self-Supervised Learning | —Unverified | 0 | 0 |