| Synthesizing Dysarthric Speech Using Multi-talker TTS for Dysarthric Speech Recognition | Jan 27, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Synthesizing Personalized Non-speech Vocalization from Discrete Speech Representations | Jun 25, 2022 | text-to-speech, Text to Speech | —Unverified | 0 |
| Synthetic Speaking Children -- Why We Need Them and How to Make Them | Nov 8, 2023 | text-to-speech, Text to Speech | —Unverified | 0 |
| Synthetic Speech Detection Based on Temporal Consistency and Distribution of Speaker Features | Sep 29, 2023 | Synthetic Speech Detection, text-to-speech | —Unverified | 0 |
| Talking Face Generation with Multilingual TTS | May 13, 2022 | Face Generation, Talking Face Generation | —Unverified | 0 |
| Talrómur: A large Icelandic TTS corpus | May 1, 2021 | text-to-speech, Text to Speech | —Unverified | 0 |
| Statistical Context-Dependent Units Boundary Correction for Corpus-based Unit-Selection Text-to-Speech | Mar 5, 2020 | Segmentation, text-to-speech | —Unverified | 0 |
| Teacher-Student Training for Robust Tacotron-based TTS | Nov 7, 2019 | Decoder, Knowledge Distillation | —Unverified | 0 |
| Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages | Nov 1, 2022 | Chunking, Rhythm | —Unverified | 0 |
| Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale | Feb 27, 2025 | AI Agent, Large Language Model | —Unverified | 0 |
| Telephonetic: Making Neural Language Models Robust to ASR and Semantic Noise | Jun 13, 2019 | Data Augmentation, Decoder | —Unverified | 0 |
| Teochew-Wild: The First In-the-wild Teochew Dataset with Orthographic Annotations | May 8, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Text-aware and Context-aware Expressive Audiobook Speech Synthesis | Jun 9, 2024 | Contrastive Learning, Language Modeling | —Unverified | 0 |
| Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS | Jul 13, 2022 | Language Modeling, Language Modelling | —Unverified | 0 |
| Text-free non-parallel many-to-many voice conversion using normalising flows | Mar 15, 2022 | Normalising Flows, Speech Synthesis | —Unverified | 0 |
| Text Generation with Speech Synthesis for ASR Data Augmentation | May 22, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis | Mar 27, 2023 | All, Automatic Speech Recognition | —Unverified | 0 |
| Text Is Not All You Need: Multimodal Prompting Helps LLMs Understand Humor | Dec 1, 2024 | All, Natural Language Understanding | —Unverified | 0 |
| Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens | Oct 4, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Text-To-Speech Data Augmentation for Low Resource Speech Recognition | Apr 1, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Text-To-Speech for Languages without an Orthography | Dec 1, 2012 | Speech Synthesis, text-to-speech | —Unverified | 0 |
| Text-to-Speech for Under-Resourced Languages: Phoneme Mapping and Source Language Selection in Transfer Learning | Jun 1, 2022 | Cross-Lingual Transfer, text-to-speech | —Unverified | 0 |
| Text-to-Speech Pipeline for Swiss German -- A comparison | May 31, 2023 | Speech Synthesis, text-to-speech | —Unverified | 0 |
| Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder | Dec 16, 2022 | Representation Learning, Speech Synthesis | —Unverified | 0 |
| Text-To-Speech Synthesis In The Wild | Sep 13, 2024 | Benchmarking, Speaker Recognition | —Unverified | 0 |
| Textual Echo Cancellation | Aug 13, 2020 | Acoustic echo cancellation, speech-recognition | —Unverified | 0 |
| The Art of Storytelling: Multi-Agent Generative AI for Dynamic Multimodal Narratives | Sep 17, 2024 | text-to-speech, Text to Speech | —Unverified | 0 |
| The C-ORAL-BRASIL I: Reference Corpus for Spoken Brazilian Portuguese | May 1, 2012 | text-to-speech, Text to Speech | —Unverified | 0 |
| The DeepZen Speech Synthesis System for Blizzard Challenge 2023 | Aug 30, 2023 | Sentence, Speech Synthesis | —Unverified | 0 |
| The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech | Jun 1, 2023 | Cross-Lingual Transfer, text-to-speech | —Unverified | 0 |
| The FruitShell French synthesis system at the Blizzard 2023 Challenge | Sep 1, 2023 | Data Augmentation, Speech Synthesis | —Unverified | 0 |
| The ILMT-s2s Corpus ― A Multimodal Interlingual Map Task Corpus | May 1, 2016 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| The Impact of Silence on Speech Anti-Spoofing | Sep 21, 2023 | Action Detection, Activity Detection | —Unverified | 0 |
| The MSXF TTS System for ICASSP 2022 ADD Challenge | Jan 27, 2022 | text-to-speech, Text to Speech | —Unverified | 0 |
| The Nós Project: Opening routes for the Galician language in the field of language technologies | Jun 1, 2022 | Cultural Vocal Bursts Intensity Prediction, Machine Translation | —Unverified | 0 |
| The NTU-AISG Text-to-speech System for Blizzard Challenge 2020 | Oct 22, 2020 | text-to-speech, Text to Speech | —Unverified | 0 |
| The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance | Apr 11, 2022 | Speaker Verification, Speech Synthesis | —Unverified | 0 |
| The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach | Oct 14, 2019 | Expressive Speech Synthesis, Sociology | —Unverified | 0 |
| The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains | Oct 4, 2023 | Speech Synthesis, text-to-speech | —Unverified | 0 |
| The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge | Apr 9, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain | Jun 3, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| TI-ASU: Toward Robust Automatic Speech Understanding through Text-to-speech Imputation Against Missing Speech Modality | Apr 27, 2024 | Imputation, text-to-speech | —Unverified | 0 |
| T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation | May 24, 2022 | Decoder, Machine Translation | —Unverified | 0 |
| Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion | Apr 6, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Total-Duration-Aware Duration Modeling for Text-to-Speech Systems | Jun 6, 2024 | Diversity, text-to-speech | —Unverified | 0 |
| Towards Accurate Lip-to-Speech Synthesis in-the-Wild | Mar 2, 2024 | Language Modelling, Lip to Speech Synthesis | —Unverified | 0 |
| Towards a Japanese Full-duplex Spoken Dialogue System | Jun 3, 2025 | Spoken Dialogue Systems, text-to-speech | —Unverified | 0 |
| Towards a Language Service Infrastructure for Mobile Environments | May 1, 2016 | text-to-speech, Text to Speech | —Unverified | 0 |
| Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer | May 15, 2024 | Adversarial Attack, Automatic Speech Recognition | —Unverified | 0 |
| Towards Flow-Matching-based TTS without Classifier-Free Guidance | Apr 29, 2025 | Speech Synthesis, text-to-speech | —Unverified | 0 |