| The ILMT-s2s Corpus ― A Multimodal Interlingual Map Task Corpus | May 1, 2016 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| The Impact of Silence on Speech Anti-Spoofing | Sep 21, 2023 | Action Detection, Activity Detection | —Unverified | 0 | 0 |
| The MSXF TTS System for ICASSP 2022 ADD Challenge | Jan 27, 2022 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| The Nós Project: Opening routes for the Galician language in the field of language technologies | Jun 1, 2022 | Cultural Vocal Bursts Intensity Prediction, Machine Translation | —Unverified | 0 | 0 |
| The NTU-AISG Text-to-speech System for Blizzard Challenge 2020 | Oct 22, 2020 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance | Apr 11, 2022 | Speaker Verification, Speech Synthesis | —Unverified | 0 | 0 |
| The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach | Oct 14, 2019 | Expressive Speech Synthesis, Sociology | —Unverified | 0 | 0 |
| The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains | Oct 4, 2023 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge | Apr 9, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain | Jun 3, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| TI-ASU: Toward Robust Automatic Speech Understanding through Text-to-speech Imputation Against Missing Speech Modality | Apr 27, 2024 | Imputation, text-to-speech | —Unverified | 0 | 0 |
| T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation | May 24, 2022 | Decoder, Machine Translation | —Unverified | 0 | 0 |
| Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion | Apr 6, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Total-Duration-Aware Duration Modeling for Text-to-Speech Systems | Jun 6, 2024 | Diversity, text-to-speech | —Unverified | 0 | 0 |
| Towards Accurate Lip-to-Speech Synthesis in-the-Wild | Mar 2, 2024 | Language Modelling, Lip to Speech Synthesis | —Unverified | 0 | 0 |
| Towards a Japanese Full-duplex Spoken Dialogue System | Jun 3, 2025 | Spoken Dialogue Systems, text-to-speech | —Unverified | 0 | 0 |
| Towards a Language Service Infrastructure for Mobile Environments | May 1, 2016 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer | May 15, 2024 | Adversarial Attack, Automatic Speech Recognition | —Unverified | 0 | 0 |
| Towards Flow-Matching-based TTS without Classifier-Free Guidance | Apr 29, 2025 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| Towards Fully Automatic Annotation of Audio Books for TTS | May 1, 2012 | Speech Recognition, Speech Synthesis | —Unverified | 0 | 0 |
| Towards human-like spoken dialogue generation between AI agents from written dialogue | Oct 2, 2023 | Dialogue Generation, text-to-speech | —Unverified | 0 | 0 |
| Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement | Jan 15, 2025 | Computational Efficiency, CPU | —Unverified | 0 | 0 |
| Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale | Aug 21, 2022 | Lipreading, Lip Reading | —Unverified | 0 | 0 |
| Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram | Feb 3, 2021 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Towards Natural Bilingual and Code-Switched Speech Synthesis Based on Mix of Monolingual Recordings and Cross-Lingual Voice Conversion | Oct 16, 2020 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| Towards Optimizing OCR for Accessibility | Jun 21, 2022 | Optical Character Recognition (OCR), text-to-speech | —Unverified | 0 | 0 |
| Towards Robust FastSpeech 2 by Modelling Residual Multimodality | Jun 2, 2023 | Decoder, text-to-speech | —Unverified | 0 | 0 |
| Towards Robust Neural Vocoding for Speech Generation: A Survey | Dec 5, 2019 | Speech Synthesis, Survey | —Unverified | 0 | 0 |
| Towards Selection of Text-to-speech Data to Augment ASR Training | May 30, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis | Aug 31, 2023 | Expressive Speech Synthesis, Sentence | —Unverified | 0 | 0 |
| Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models | Jun 17, 2019 | Decoder, Speech Synthesis | —Unverified | 0 | 0 |
| Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders | Oct 28, 2022 | Speaker Verification, text-to-speech | —Unverified | 0 | 0 |
| Towards Zero-Shot Text-To-Speech for Arabic Dialects | Jun 24, 2024 | Dialect Identification, Speech Synthesis | —Unverified | 0 | 0 |
| Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora | Apr 1, 2019 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Training Universal Vocoders with Feature Smoothing-Based Augmentation Methods for High-Quality TTS Systems | Sep 4, 2024 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Training Wake Word Detection with Synthesized Speech Data on Confusion Words | Nov 3, 2020 | Data Augmentation, Keyword Spotting | —Unverified | 0 | 0 |
| Transcript-Prompted Whisper with Dictionary-Enhanced Decoding for Japanese Speech Annotation | Jun 9, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction | Nov 6, 2023 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus | Mar 29, 2022 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Transfer the linguistic representations from TTS to accent conversion with non-parallel data | Jan 7, 2024 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Transformer-based Models of Text Normalization for Speech Applications | Feb 1, 2022 | Sentence, Speech Synthesis | —Unverified | 0 | 0 |
| Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis | Jul 25, 2022 | Data Augmentation, Speech Synthesis | —Unverified | 0 | 0 |
| Triple M: A Practical Text-to-speech Synthesis System With Multi-guidance Attention And Multi-band Multi-time LPCNet | Jan 30, 2021 | CPU, Sentence | —Unverified | 0 | 0 |
| TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder | Jun 30, 2022 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| TTSDS2: Resources and Benchmark for Evaluating Human-Quality Text to Speech Systems | Jun 24, 2025 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| TTS for Low Resource Languages: A Bangla Synthesizer | May 1, 2016 | Text Normalization, text-to-speech | —Unverified | 0 | 0 |
| TTS-Guided Training for Accent Conversion Without Parallel Data | Dec 20, 2022 | Decoder, text-to-speech | —Unverified | 0 | 0 |
| TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations | Jul 2, 2024 | Benchmarking, text-to-speech | —Unverified | 0 | 0 |
| TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer | Jan 10, 2025 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| UmbraTTS: Adapting Text-to-Speech to Environmental Contexts with Flow Matching | Jun 11, 2025 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |