Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer Sep 3, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios Dec 23, 2021 Diversity Speech Synthesis
Multi-speaker Text-to-speech Synthesis Using Deep Gaussian Processes Aug 7, 2020 Gaussian Processes Speech Synthesis
Multi-Stage Deep Transfer Learning for EmIoT-enabled Human-Computer Interaction Feb 3, 2022 Human-Object Interaction Detection text-to-speech
Multi-step Natural Language Understanding Aug 1, 2013 Natural Language Understanding Speech Recognition
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models Nov 17, 2022 Speech Synthesis text-to-speech
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era Oct 6, 2022 Speech Synthesis text-to-speech
Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis Aug 27, 2019 Speech Synthesis text-to-speech
Neural Models of Text Normalization for Speech Applications Jun 1, 2019 BIG-bench Machine Learning Speech Synthesis
Neural Speech Synthesis in German Oct 3, 2021 Speech Synthesis text-to-speech
A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions Jun 4, 2025 Data Augmentation Diversity
Neural Text Normalization with Subword Units Jun 1, 2019 Machine Translation Natural Language Understanding
Neural Text-to-Speech Synthesis for an Under-Resourced Language in a Diglossic Environment: the Case of Gascon Occitan May 1, 2020 Speech Synthesis text-to-speech
Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters Jan 10, 2024 Self-Supervised Learning Speech Enhancement
Normalization of Lithuanian Text Using Regular Expressions Dec 29, 2023 Speech Synthesis Text Normalization
Normalization of Non-Standard Words in Croatian Texts Mar 27, 2015 Form General Classification
Normalizing Text using Language Modelling based on Phonetics and String Similarity Jun 25, 2020 Language Modeling Language Modelling
Open-Source Boundary-Annotated Corpus for Arabic Speech and Language Processing May 1, 2012 Chunking Descriptive
VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature Apr 2, 2022 Speech Synthesis text-to-speech
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis Jun 3, 2021 Speaker Verification Speech Synthesis
An In-depth Analysis of the Effect of Text Normalization in Social Media May 1, 2015 Dependency Parsing named-entity-recognition
Parallel WaveNet conditioned on VAE latent vectors Dec 17, 2020 Sentence Speech Synthesis
ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations Mar 1, 2023 Self-Supervised Learning Speech Synthesis
Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis Jun 4, 2024 In-Context Learning Language Modeling
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS Mar 28, 2021 Representation Learning Text-To-Speech Synthesis
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis Dec 8, 2023 Benchmarking Quantization
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis Aug 4, 2018 Speech Synthesis text-to-speech
Predicting Romanian Stress Assignment Apr 1, 2014 Speech Synthesis Text-To-Speech Synthesis
Probing Speaker-specific Features in Speaker Representations Jan 9, 2025 Self-Supervised Learning Speaker Verification
A Multi-Agent Framework for Automated Qinqiang Opera Script Generation Using Large Language Models Apr 22, 2025 cross-modal alignment Script Generation
PROEMO: Prompt-Driven Text-to-Speech Synthesis Based on Emotion and Intensity Control Jan 10, 2025 Speech Synthesis text-to-speech
PSCodec: A Series of High-Fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders Apr 3, 2024 Representation Learning Speaker Verification
ProsodyFM: Unsupervised Phrasing and Intonation Control for Intelligible Speech Synthesis Dec 16, 2024 Speech Synthesis text-to-speech
Prosody-TTS: An end-to-end speech synthesis system with prosody control Oct 6, 2021 Rhythm Speech Synthesis
Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis Apr 14, 2025 Language Modeling Language Modelling
Punjabi Text-To-Speech Synthesis System Dec 1, 2012 Speech Synthesis text-to-speech
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder Jul 31, 2018 Generative Adversarial Network Speech Synthesis
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks Oct 30, 2018 Image Generation Speech Synthesis
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis Apr 4, 2024 Language Modeling Language Modelling
Real-time Incremental Speech-to-Speech Translation of Dialogs Jun 1, 2012 Machine Translation Speech Recognition
ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence May 9, 2022 Speech Synthesis text-to-speech
Refer-iTTS: A System for Referring in Spoken Installments to Objects in Real-World Images Sep 1, 2017 Referring Expression Referring expression generation
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability Apr 3, 2021 Emotion Recognition reinforcement-learning
DLPO: Diffusion Model Loss-Guided Reinforcement Learning for Fine-Tuning Text-to-Speech Diffusion Models May 23, 2024 Image Generation reinforcement-learning
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement Dec 21, 2022 Audio-Visual Speech Recognition Resynthesis
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration Jan 1, 2023 Audio-Visual Speech Recognition Resynthesis
Revival with Voice: Multi-modal Controllable Text-to-Speech Synthesis May 25, 2025 Speech Synthesis text-to-speech
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS Jun 30, 2022 Decoder GPU
Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization Jul 2, 2024 Inference Optimization Speech Synthesis
RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus May 1, 2014 Speech Synthesis text-to-speech