| ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020 | May 24, 2020 | Data Augmentation, Decoder | —Unverified | 0 |
| Optimization of a Real-Time Wavelet-Based Algorithm for Improving Speech Intelligibility | Feb 5, 2022 | Speech Enhancement, Speech-to-Text | —Unverified | 0 |
| OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | Feb 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction | Feb 10, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling | Jun 21, 2021 | speech-recognition, Speech Recognition | —Unverified | 0 |
| AeGAN: Time-Frequency Speech Denoising via Generative Adversarial Networks | Oct 21, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M | Jul 6, 2023 | Speech-to-Text | —Unverified | 0 |
| PhantomSound: Black-Box, Query-Efficient Audio Adversarial Attack via Split-Second Phoneme Injection | Sep 13, 2023 | Adversarial Attack, Speech-to-Text | —Unverified | 0 |
| Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili | Oct 29, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Polish Read Speech Corpus for Speech Tools and Services | Jun 1, 2017 | Action Detection, Activity Detection | —Unverified | 0 |
| Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison | Jan 4, 2025 | Decoder, Knowledge Distillation | —Unverified | 0 |
| Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases | Feb 1, 2024 | speech-recognition, Speech Recognition | —Unverified | 0 |
| Punctuation restoration in Swedish through fine-tuned KB-BERT | Feb 14, 2022 | Language Modelling, Punctuation Restoration | —Unverified | 0 |
| Pushing the performances of ASR models on English and Spanish accents | Dec 22, 2022 | Speech-to-Text | —Unverified | 0 |
| Recent Advances in Direct Speech-to-text Translation | Jun 20, 2023 | Data Augmentation, Decoder | —Unverified | 0 |
| Representation Purification for End-to-End Speech Translation | Dec 5, 2024 | Machine Translation, Rhythm | —Unverified | 0 |
| Revisiting End-to-End Speech-to-Text Translation From Scratch | Jun 9, 2022 | Decoder, speech-recognition | —Unverified | 0 |
| Revisiting the Entropy Semiring for Neural Speech Recognition | Dec 13, 2023 | speech-recognition, Speech Recognition | —Unverified | 0 |
| Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking | Mar 13, 2024 | Chinese Spell Checking, In-Context Learning | —Unverified | 0 |
| Robust Semantic Communications for Speech Transmission | Mar 8, 2024 | Generative Adversarial Network, Semantic Communication | —Unverified | 0 |
| Role of Intonation in Scoring Spoken English | Aug 23, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks | Jul 14, 2022 | Speech-to-Text | —Unverified | 0 |
| S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation | Jun 11, 2025 | Reading Comprehension, Speech Synthesis | —Unverified | 0 |
| SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation | Oct 13, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation | May 17, 2022 | Representation Learning, Retrieval | —Unverified | 0 |
| Self-Supervised Representations Improve End-to-End Speech Translation | Jun 22, 2020 | Cross-Lingual Transfer, speech-recognition | —Unverified | 0 |
| Semantic-aware Speech to Text Transmission with Redundancy Removal | Feb 7, 2022 | Semantic Communication, Speech-to-Text | —Unverified | 0 |
| Semantic MIMO Systems for Speech-to-Text Transmission | May 13, 2024 | Semantic Communication, Speech-to-Text | —Unverified | 0 |
| Semantic-preserved Communication System for Highly Efficient Speech Transmission | May 25, 2022 | Semantic Communication, speech-recognition | —Unverified | 0 |
| Simple and Effective Unsupervised Speech Translation | Oct 18, 2022 | Domain Adaptation, Machine Translation | —Unverified | 0 |
| SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | Jun 20, 2024 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 |
| SimulSpeech: End-to-End Simultaneous Speech to Text Translation | Jul 1, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Speak2Label: Using Domain Knowledge for Creating a Large Scale Driver Gaze Zone Estimation Dataset | Apr 13, 2020 | Gaze Prediction, Speech-to-Text | —Unverified | 0 |
| Speaker Independent Continuous Speech to Text Converter for Mobile Application | Jul 19, 2013 | Action Detection, Activity Detection | —Unverified | 0 |
| Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction | May 8, 2013 | Speech Synthesis, Speech-to-Text | —Unverified | 0 |
| SpeechAlign: a Framework for Speech Translation Alignment Evaluation | Sep 20, 2023 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 |
| Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody? | Oct 31, 2024 | Rhythm, speech-recognition | —Unverified | 0 |
| Speech Recognition Web Services for Dutch | May 1, 2014 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Speech to Speech Translation with Translatotron: A State of the Art Review | Feb 9, 2025 | speech-recognition, Speech Recognition | —Unverified | 0 |
| Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation | May 17, 2020 | Computational Efficiency, speech-recognition | —Unverified | 0 |
| Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding | Jun 8, 2023 | dialog state tracking, Language Modeling | —Unverified | 0 |
| Speech-to-Text and Evaluation of Multiple Machine Translation Systems | Sep 1, 2022 | Machine Translation, Speech-to-Text | —Unverified | 0 |
| Speech to text and text to speech recognition systems-A review | Mar 17, 2018 | speech-recognition, Speech Recognition | —Unverified | 0 |
| Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios | May 30, 2025 | Cross-Lingual Transfer, Phoneme Recognition | —Unverified | 0 |
| Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing? | Feb 19, 2024 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 |
| SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation | Nov 3, 2024 | speech-recognition, Speech Recognition | —Unverified | 0 |
| SpiCE: A New Open-Access Corpus of Conversational Bilingual Speech in Cantonese and English | May 1, 2020 | Sentence, Speech-to-Text | —Unverified | 0 |
| Strategies for improving low resource speech to text translation relying on pre-trained ASR models | May 31, 2023 | Automatic Speech Recognition, Decoder | —Unverified | 0 |
| StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection | Jun 10, 2024 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 |
| STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions | May 30, 2023 | All, Automatic Speech Recognition | —Unverified | 0 |