| An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation | Aug 28, 2023 | Machine Translation, NMT | Code Available | 0 |
| SONAR: Sentence-Level Multimodal and Language-Agnostic Representations | Aug 22, 2023 | Decoder, Machine Translation | Code Available | 2 |
| SeamlessM4T: Massively Multilingual & Multimodal Machine Translation | Aug 22, 2023 | Automatic Speech Recognition, Machine Translation | Code Available | 2 |
| On decoder-only architecture for speech-to-text and large language model integration | Jul 8, 2023 | Decoder, Language Modeling | Unverified | 0 |
| AudioPaLM: A Large Language Model That Can Speak and Listen | Jun 22, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Recent Advances in Direct Speech-to-text Translation | Jun 20, 2023 | Data Augmentation, Decoder | Unverified | 0 |
| Improved Cross-Lingual Transfer Learning For Automatic Speech Translation | Jun 1, 2023 | automatic-speech-translation, Cross-Lingual Transfer | Unverified | 0 |
| Strategies for improving low resource speech to text translation relying on pre-trained ASR models | May 31, 2023 | Automatic Speech Recognition, Decoder | Unverified | 0 |
| ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation | May 24, 2023 | GPU, Language Modeling | Code Available | 1 |
| DUB: Discrete Unit Back-translation for Speech Translation | May 19, 2023 | Machine Translation, Speech-to-Text | Code Available | 1 |
| Back Translation for Speech-to-text Translation Without Transcripts | May 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks | May 4, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit | Apr 10, 2023 | Benchmarking, Simultaneous Speech-to-Text Translation | Code Available | 0 |
| Enhancing Speech-to-Speech Translation with Multiple TTS Targets | Apr 10, 2023 | Speech-to-Speech Translation, Speech-to-Text | Unverified | 0 |
| Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages | Mar 2, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation | Mar 1, 2023 | Audio-Visual Speech Recognition, Robust Speech Recognition | Code Available | 2 |
| Pre-training for Speech Translation: CTC Meets Optimal Transport | Jan 27, 2023 | Multi-Task Learning, Speech-to-Text | Code Available | 1 |
| WACO: Word-Aligned Contrastive Learning for Speech Translation | Dec 19, 2022 | Contrastive Learning, Speech-to-Text | Code Available | 0 |
| M3ST: Mix at Three Levels for Speech Translation | Dec 7, 2022 | Data Augmentation, Diversity | Unverified | 0 |
| Efficient Speech Translation with Dynamic Latent Perceivers | Oct 28, 2022 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 |
| Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation | Oct 24, 2022 | Segmentation, Speech-to-Text | Code Available | 0 |
| Simple and Effective Unsupervised Speech Translation | Oct 18, 2022 | Domain Adaptation, Machine Translation | Unverified | 0 |
| CTC Alignments Improve Autoregressive Translation | Oct 11, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation | Jul 3, 2022 | Decoder, Speech-to-Text | Code Available | 0 |
| Language Model Augmented Monotonic Attention for Simultaneous Translation | Jul 1, 2022 | Language Modeling, Language Modelling | Unverified | 0 |
| Revisiting End-to-End Speech-to-Text Translation From Scratch | Jun 9, 2022 | Decoder, speech-recognition | Code Available | 0 |
| PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All, Automatic Speech Recognition (ASR) | Code Available | 6 |
| SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation | May 17, 2022 | Representation Learning, Retrieval | Unverified | 0 |
| Cross-modal Contrastive Learning for Speech Translation | May 5, 2022 | Contrastive Learning, Retrieval | Code Available | 1 |
| Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages | May 2, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation | May 1, 2022 | Segmentation, Simultaneous Speech-to-Text Translation | Unverified | 0 |
| NAIST Simultaneous Speech-to-Text Translation System for IWSLT 2022 | May 1, 2022 | Segmentation, Simultaneous Speech-to-Text Translation | Unverified | 0 |
| LibriS2S: A German-English Speech-to-Speech Translation Corpus | Apr 22, 2022 | Speech-to-Speech Translation, Speech-to-Text | Code Available | 0 |
| Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation | Apr 6, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| XTREME-S: Evaluating Cross-lingual Speech Representations | Mar 21, 2022 | Representation Learning, Retrieval | Unverified | 0 |
| STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation | Mar 20, 2022 | Machine Translation, Speech-to-Text | Code Available | 1 |
| A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing | Mar 18, 2022 | Representation Learning, Speaker Verification | Code Available | 1 |
| SHAS: Approaching optimal Segmentation for End-to-End Speech Translation | Feb 9, 2022 | Segmentation, Speech-to-Text Translation | Code Available | 1 |
| CVSS Corpus and Massively Multilingual Speech-to-Speech Translation | Jan 11, 2022 | Sentence, Speech-to-Speech Translation | Code Available | 2 |
| Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement | Dec 21, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Cross-modal Contrastive Learning for Speech Translation | Dec 17, 2021 | Contrastive Learning, Retrieval | Unverified | 0 |
| Improve Sinhala Speech Recognition Through e2e LF-MMI Model | Dec 1, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| An Experiment on Speech-to-Text Translation Systems for Manipuri to English on Low Resource Setting | Dec 1, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems | Oct 13, 2021 | Sentence, Simultaneous Speech-to-Text Translation | Unverified | 0 |
| Learning When to Translate for Streaming Speech | Sep 15, 2021 | Decoder, Sentence | Code Available | 1 |
| Speechformer: Reducing Information Loss in Direct Speech Translation | Sep 9, 2021 | Speech-to-Text Translation, Translation | Code Available | 0 |
| Infusing Future Information into Monotonic Attention Through Language Models | Sep 7, 2021 | Language Modeling, Language Modelling | Code Available | 0 |
| Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task | Jul 12, 2021 | Decoder, Knowledge Distillation | Unverified | 0 |
| Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling | Jun 21, 2021 | speech-recognition, Speech Recognition | Unverified | 0 |
| Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR | Jun 11, 2021 | Simultaneous Speech-to-Text Translation, Speech-to-Text | Unverified | 0 |