| Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System | Apr 20, 2020 | text-to-speech, Text to Speech | Unverified | 0 |
| Data Redaction from Conditional Generative Models | May 18, 2023 | text-to-speech, Text to Speech | Unverified | 0 |
| D-CAPTCHA++: A Study of Resilience of Deepfake CAPTCHA under Transferable Imperceptible Adversarial Attack | Sep 11, 2024 | Adversarial Attack, Audio Synthesis | Unverified | 0 |
| Towards Selection of Text-to-speech Data to Augment ASR Training | May 30, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis | Aug 31, 2023 | Expressive Speech Synthesis, Sentence | Unverified | 0 |
| Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models | Jun 17, 2019 | Decoder, Speech Synthesis | Unverified | 0 |
| Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders | Oct 28, 2022 | Speaker Verification, text-to-speech | Unverified | 0 |
| Towards Zero-Shot Text-To-Speech for Arabic Dialects | Jun 24, 2024 | Dialect Identification, Speech Synthesis | Unverified | 0 |
| Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora | Apr 1, 2019 | text-to-speech, Text to Speech | Unverified | 0 |
| Training Universal Vocoders with Feature Smoothing-Based Augmentation Methods for High-Quality TTS Systems | Sep 4, 2024 | text-to-speech, Text to Speech | Unverified | 0 |
| Training Wake Word Detection with Synthesized Speech Data on Confusion Words | Nov 3, 2020 | Data Augmentation, Keyword Spotting | Unverified | 0 |
| Transcript-Prompted Whisper with Dictionary-Enhanced Decoding for Japanese Speech Annotation | Jun 9, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction | Nov 6, 2023 | text-to-speech, Text to Speech | Unverified | 0 |
| Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus | Mar 29, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| Transfer the linguistic representations from TTS to accent conversion with non-parallel data | Jan 7, 2024 | text-to-speech, Text to Speech | Unverified | 0 |
| Transformer-based Models of Text Normalization for Speech Applications | Feb 1, 2022 | Sentence, Speech Synthesis | Unverified | 0 |
| Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis | Jul 25, 2022 | Data Augmentation, Speech Synthesis | Unverified | 0 |
| Triple M: A Practical Text-to-speech Synthesis System With Multi-guidance Attention And Multi-band Multi-time LPCNet | Jan 30, 2021 | CPU, Sentence | Unverified | 0 |
| TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder | Jun 30, 2022 | Speech Synthesis, text-to-speech | Unverified | 0 |
| TTSDS2: Resources and Benchmark for Evaluating Human-Quality Text to Speech Systems | Jun 24, 2025 | text-to-speech, Text to Speech | Unverified | 0 |
| TTS for Low Resource Languages: A Bangla Synthesizer | May 1, 2016 | Text Normalization, text-to-speech | Unverified | 0 |
| TTS-Guided Training for Accent Conversion Without Parallel Data | Dec 20, 2022 | Decoder, text-to-speech | Unverified | 0 |
| TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations | Jul 2, 2024 | Benchmarking, text-to-speech | Unverified | 0 |
| TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer | Jan 10, 2025 | speech-recognition, Speech Recognition | Unverified | 0 |
| UmbraTTS: Adapting Text-to-Speech to Environmental Contexts with Flow Matching | Jun 11, 2025 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Une aide à la communication par pictogrammes avec prédiction sémantique | Jun 1, 2015 | text-to-speech, Text to Speech | Unverified | 0 |
| UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation | Jun 4, 2025 | cross-modal alignment, Lipreading | Unverified | 0 |
| Unified speech and gesture synthesis using flow matching | Oct 8, 2023 | Audio Synthesis, Motion Synthesis | Unverified | 0 |
| UniFLG: Unified Facial Landmark Generator from Text or Speech | Feb 28, 2023 | Decoder, Face Generation | Unverified | 0 |
| Unify and Conquer: How Phonetic Feature Representation Affects Polyglot Text-To-Speech (TTS) | Jul 4, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion | Jan 10, 2023 | Quantization, text-to-speech | Unverified | 0 |
| UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation | Mar 2, 2025 | Decoder, Representation Learning | Unverified | 0 |
| Unsupervised Data Validation Methods for Efficient Model Training | Oct 10, 2024 | Data Augmentation, model | Unverified | 0 |
| Unsupervised Learning For Sequence-to-sequence Text-to-speech For Low-resource Languages | Aug 11, 2020 | Quantization, text-to-speech | Unverified | 0 |
| Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis | Oct 1, 2022 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Unsupervised Polyglot Text To Speech | Feb 6, 2019 | text-to-speech, Text to Speech | Unverified | 0 |
| Unsupervised pre-training for sequence to sequence speech recognition | Oct 28, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Unsupervised Quantized Prosody Representation for Controllable Speech Synthesis | Apr 7, 2022 | Quantization, Speech Synthesis | Unverified | 0 |
| Unsupervised word-level prosody tagging for controllable speech synthesis | Feb 15, 2022 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Controllable Speaking Styles Using a Large Language Model | May 17, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Using Audio Books for Training a Text-to-Speech System | May 1, 2014 | Diversity, Speech Synthesis | Unverified | 0 |
| Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition | Jan 6, 2023 | Domain Adaptation, GPU | Unverified | 0 |
| Using IPA-Based Tacotron for Data Efficient Cross-Lingual Speaker Adaptation and Pronunciation Enhancement | Nov 12, 2020 | text-to-speech, Text to Speech | Unverified | 0 |
| Using previous acoustic context to improve Text-to-Speech synthesis | Dec 7, 2020 | Decoder, Speech Synthesis | Unverified | 0 |
| Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset | Sep 14, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems | Nov 23, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Using the LARA Little Prince to compare human and TTS audio quality | Jun 1, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech | Nov 28, 2019 | Disentanglement, Expressive Speech Synthesis | Unverified | 0 |
| Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction | Jan 3, 2024 | text-to-speech, Text to Speech | Unverified | 0 |
| Utilizing Speech Emotion Recognition and Recommender Systems for Negative Emotion Handling in Therapy Chatbots | Nov 18, 2023 | Chatbot, Emotion Recognition | Unverified | 0 |