Title | Date | Tags | Code status
Meta Learning Text-to-Speech Synthesis in over 7000 Languages | Jun 10, 2024 | Meta-Learning, Speech Synthesis | Code available
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers | Sep 5, 2022 | Automatic Speech Recognition (ASR) | Code available
MelNet: A Generative Model for Audio in the Frequency Domain | Jun 4, 2019 | Audio Generation, Music Generation | Code available
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis | Apr 26, 2021 | Language Modeling | Code available
Creating New Language and Voice Components for the Updated MaryTTS Text-to-Speech Synthesis Platform | Dec 13, 2017 | Speech Synthesis, text-to-speech | Unverified
Controllable Prosody Generation With Partial Inputs | Mar 14, 2023 | Speech Synthesis, text-to-speech | Unverified
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis | Nov 11, 2019 | Polyphone disambiguation, Speech Synthesis | Unverified
Controllable neural text-to-speech synthesis using intuitive prosodic features | Sep 14, 2020 | Sentence, Speech Synthesis | Unverified
Controllable Accented Text-to-Speech Synthesis | Sep 22, 2022 | Speech Synthesis, text-to-speech | Unverified
A unified front-end framework for English text-to-speech synthesis | May 18, 2023 | Speech Synthesis, Text Normalization | Unverified
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era | Oct 6, 2022 | Speech Synthesis, text-to-speech | Unverified
Continual Speaker Adaptation for Text-to-Speech Synthesis | Mar 26, 2021 | Continual Learning, Diversity | Unverified
Full-text Error Correction for Chinese Speech Recognition with Large Language Model | Sep 12, 2024 | Automatic Speech Recognition (ASR) | Unverified
FMSD-TTS: Few-shot Multi-Speaker Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation | May 20, 2025 | Dataset Generation, Speech Synthesis | Unverified
Conditioning Sequence-to-sequence Networks with Learned Activations | Sep 29, 2021 | Automatic Speech Recognition (ASR) | Unverified
A Unified Framework for Collecting Text-to-Speech Synthesis Datasets for 22 Indian Languages | Oct 18, 2024 | Speech Synthesis, text-to-speech | Unverified
FLY-TTS: Fast, Lightweight and High-Quality End-to-End Text-to-Speech Synthesis | Jun 30, 2024 | CPU, Decoder | Unverified
Generative adversarial network-based glottal waveform model for statistical parametric speech synthesis | Mar 14, 2019 | Generative Adversarial Network, Speech Synthesis | Unverified
Flavored Tacotron: Conditional Learning for Prosodic-linguistic Features | Apr 8, 2021 | Decoder, Speech Synthesis | Unverified
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement | Nov 8, 2020 | Disentanglement, Speech Synthesis | Unverified
Generative Pre-training for Speech with Flow Matching | Oct 25, 2023 | Speech Enhancement, Speech Synthesis | Unverified
Generative Semantic Communication for Text-to-Speech Synthesis | Oct 4, 2024 | Quantization, Semantic Communication | Unverified
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech | Jul 31, 2023 | Acoustic Modelling, Speech Synthesis | Unverified
Augmenting Images for ASR and TTS through Single-loop and Dual-loop Multimodal Chain Framework | Nov 4, 2020 | Automatic Speech Recognition (ASR) | Unverified
A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions | Jun 4, 2025 | Data Augmentation, Diversity | Unverified
An Investigation of the Relation Between Grapheme Embeddings and Pronunciation for Tacotron-based Systems | Oct 21, 2020 | Grapheme-to-Phoneme Conversion, Relation | Unverified
A distributed cloud-based dialog system for conversational application development | Sep 1, 2015 | Speech Recognition, Speech Synthesis | Unverified
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints | Dec 2, 2023 | Speech Synthesis, text-to-speech | Unverified
Fast Bootstrapping of Grapheme to Phoneme System for Under-resourced Languages - Application to the Iban Language | Oct 1, 2013 | Speech Recognition, Speech Synthesis | Unverified
Chain-of-Thought Training for Open E2E Spoken Dialogue Systems | May 31, 2025 | Language Modeling | Unverified
Exploring Transfer Learning for Urdu Speech Synthesis | Jun 1, 2022 | Speech Synthesis, text-to-speech | Unverified
CASSANDRA: A multipurpose configurable voice-enabled human-computer-interface | Apr 1, 2017 | Automatic Speech Recognition (ASR) | Unverified
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI | Mar 23, 2023 | Speech Enhancement, Speech Synthesis | Unverified
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis | Jun 3, 2021 | Speaker Verification, Speech Synthesis | Unverified
Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model | May 16, 2024 | Hallucination, Language Modeling | Unverified
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech | Jun 3, 2025 | Speech Synthesis, text-to-speech | Unverified
Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs | Sep 9, 2019 | Form, Speech Synthesis | Unverified
BU-TTS: An Open-Source, Bilingual Welsh-English, Text-to-Speech Corpus | Jun 1, 2022 | Speech Synthesis, text-to-speech | Unverified
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms | Nov 9, 2018 | GPU, Image Captioning | Unverified
EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models | Sep 22, 2022 | Speech Synthesis, text-to-speech | Unverified
Environment Aware Text-to-Speech Synthesis | Oct 8, 2021 | Attribute, Disentanglement | Unverified
Building Text-to-Speech Systems for Resource Poor Languages | May 1, 2012 | Clustering, Speech Synthesis | Unverified
Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback | Jun 2, 2024 | Speech Synthesis, text-to-speech | Unverified
Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis | May 1, 2012 | Audio-Visual Speech Recognition, Speech Recognition | Unverified
An In-depth Analysis of the Effect of Text Normalization in Social Media | May 1, 2015 | Dependency Parsing, named-entity-recognition | Unverified
Adaptive Parser-Centric Text Normalization | Aug 1, 2013 | Machine Translation, Speech Recognition | Unverified
Accented Text-to-Speech Synthesis with Limited Data | May 8, 2023 | Speech Synthesis, text-to-speech | Unverified
End-to-End Text-to-Speech using Latent Duration based on VQ-VAE | Oct 19, 2020 | Speech Synthesis, text-to-speech | Unverified
BUCEADOR, a multi-language search engine for digital libraries | May 1, 2012 | Automatic Speech Recognition (ASR) | Unverified
End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator | Oct 31, 2018 | Automatic Speech Recognition (ASR) | Unverified