| STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions | May 30, 2023 | All, Automatic Speech Recognition | Unverified | 0 | 0 |
| STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent | Mar 28, 2022 | text-to-speech, Text to Speech | Unverified | 0 | 0 |
| Study of Indian English Pronunciation Variabilities relative to Received Pronunciation | Apr 13, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech | Nov 4, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN | Oct 27, 2023 | Decoder, Denoising | Unverified | 0 | 0 |
| Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models | Oct 6, 2021 | text-to-speech, Text to Speech | Unverified | 0 | 0 |
| StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis | Sep 24, 2024 | Speech Synthesis, text-to-speech | Unverified | 0 | 0 |
| Style Mixture of Experts for Expressive Text-To-Speech Synthesis | Jun 5, 2024 | Mixture-of-Experts, Speech Synthesis | Unverified | 0 | 0 |
| STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech | Mar 17, 2021 | Speech Synthesis, Style Transfer | Unverified | 0 | 0 |
| Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation | Aug 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion | Sep 16, 2024 | Speech Synthesis, text-to-speech | Unverified | 0 | 0 |
| Style Variation as a Vantage Point for Code-Switching | May 1, 2020 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| SupertonicTTS: Towards Highly Scalable and Efficient Text-to-Speech System | Mar 29, 2025 | Speech Synthesis, text-to-speech | Unverified | 0 | 0 |
| Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition | Jun 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| SyncSpeech: Low-Latency and Efficient Dual-Stream Text-to-Speech based on Temporal Masked Transformer | Feb 16, 2025 | text-to-speech, Text to Speech | Unverified | 0 | 0 |
| Syntactic representation learning for neural network based TTS with syntactic parse tree traversal | Dec 13, 2020 | Diversity, Representation Learning | Unverified | 0 | 0 |
| Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech | Nov 24, 2020 | Data Augmentation, Speaker Recognition | Unverified | 0 | 0 |
| Synth4Kws: Synthesized Speech for User Defined Keyword Spotting in Low Resource Environments | Jul 23, 2024 | Diversity, Keyword Spotting | Unverified | 0 | 0 |
| SynthASR: Unlocking Synthetic Data for Speech Recognition | Jun 14, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Synthesizing Dysarthric Speech Using Multi-talker TTS for Dysarthric Speech Recognition | Jan 27, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Synthesizing Personalized Non-speech Vocalization from Discrete Speech Representations | Jun 25, 2022 | text-to-speech, Text to Speech | Unverified | 0 | 0 |
| Synthetic Speaking Children -- Why We Need Them and How to Make Them | Nov 8, 2023 | text-to-speech, Text to Speech | Unverified | 0 | 0 |
| Synthetic Speech Detection Based on Temporal Consistency and Distribution of Speaker Features | Sep 29, 2023 | Synthetic Speech Detection, text-to-speech | Unverified | 0 | 0 |
| Talking Face Generation with Multilingual TTS | May 13, 2022 | Face Generation, Talking Face Generation | Unverified | 0 | 0 |
| Talrómur: A large Icelandic TTS corpus | May 1, 2021 | text-to-speech, Text to Speech | Unverified | 0 | 0 |
| Statistical Context-Dependent Units Boundary Correction for Corpus-based Unit-Selection Text-to-Speech | Mar 5, 2020 | Segmentation, text-to-speech | Unverified | 0 | 0 |
| Teacher-Student Training for Robust Tacotron-based TTS | Nov 7, 2019 | Decoder, Knowledge Distillation | Unverified | 0 | 0 |
| Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages | Nov 1, 2022 | Chunking, Rhythm | Unverified | 0 | 0 |
| Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale | Feb 27, 2025 | AI Agent, Large Language Model | Unverified | 0 | 0 |
| Telephonetic: Making Neural Language Models Robust to ASR and Semantic Noise | Jun 13, 2019 | Data Augmentation, Decoder | Unverified | 0 | 0 |
| Teochew-Wild: The First In-the-wild Teochew Dataset with Orthographic Annotations | May 8, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Text-aware and Context-aware Expressive Audiobook Speech Synthesis | Jun 9, 2024 | Contrastive Learning, Language Modeling | Unverified | 0 | 0 |
| Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS | Jul 13, 2022 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Text-free non-parallel many-to-many voice conversion using normalising flows | Mar 15, 2022 | Normalising Flows, Speech Synthesis | Unverified | 0 | 0 |
| Text Generation with Speech Synthesis for ASR Data Augmentation | May 22, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis | Mar 27, 2023 | All, Automatic Speech Recognition | Unverified | 0 | 0 |
| Text Is Not All You Need: Multimodal Prompting Helps LLMs Understand Humor | Dec 1, 2024 | All, Natural Language Understanding | Unverified | 0 | 0 |
| Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens | Oct 4, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Text-To-Speech Data Augmentation for Low Resource Speech Recognition | Apr 1, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| Text-To-Speech for Languages without an Orthography | Dec 1, 2012 | Speech Synthesis, text-to-speech | Unverified | 0 | 0 |
| Text-to-Speech for Under-Resourced Languages: Phoneme Mapping and Source Language Selection in Transfer Learning | Jun 1, 2022 | Cross-Lingual Transfer, text-to-speech | Unverified | 0 | 0 |
| Text-to-Speech Pipeline for Swiss German -- A comparison | May 31, 2023 | Speech Synthesis, text-to-speech | Unverified | 0 | 0 |
| Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder | Dec 16, 2022 | Representation Learning, Speech Synthesis | Unverified | 0 | 0 |
| Text-To-Speech Synthesis In The Wild | Sep 13, 2024 | Benchmarking, Speaker Recognition | Unverified | 0 | 0 |
| Textual Echo Cancellation | Aug 13, 2020 | Acoustic echo cancellation, speech-recognition | Unverified | 0 | 0 |
| The Art of Storytelling: Multi-Agent Generative AI for Dynamic Multimodal Narratives | Sep 17, 2024 | text-to-speech, Text to Speech | Unverified | 0 | 0 |
| The C-ORAL-BRASIL I: Reference Corpus for Spoken Brazilian Portuguese | May 1, 2012 | text-to-speech, Text to Speech | Unverified | 0 | 0 |
| The DeepZen Speech Synthesis System for Blizzard Challenge 2023 | Aug 30, 2023 | Sentence, Speech Synthesis | Unverified | 0 | 0 |
| The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech | Jun 1, 2023 | Cross-Lingual Transfer, text-to-speech | Unverified | 0 | 0 |
| The FruitShell French synthesis system at the Blizzard 2023 Challenge | Sep 1, 2023 | Data Augmentation, Speech Synthesis | Unverified | 0 | 0 |