Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis | May 29, 2023 | Speech Synthesis, text-to-speech
Creating Personalized Synthetic Voices from Post-Glossectomy Speech with Guided Diffusion Models | May 27, 2023 | Speech Synthesis, Voice Conversion
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM | May 24, 2023 | Language Modelling, Question Answering | Code available
ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings | May 23, 2023 | Chatbot, Reading Comprehension
CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center | May 23, 2023 | Speech Synthesis
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models | May 23, 2023 | Speech Synthesis, text-to-speech
Text Generation with Speech Synthesis for ASR Data Augmentation | May 22, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages | May 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting | May 19, 2023 | Speech Synthesis, text-to-speech
A unified front-end framework for English text-to-speech synthesis | May 18, 2023 | Speech Synthesis, Text Normalization
Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms | May 18, 2023 | Speech Synthesis | Code available
Empirical Analysis of Oral and Nasal Vowels of Konkani | May 17, 2023 | Speech Synthesis
Zero-shot personalized lip-to-speech synthesis with face image based voice control | May 9, 2023 | Lip to Speech Synthesis, Representation Learning
Accented Text-to-Speech Synthesis with Limited Data | May 8, 2023 | Speech Synthesis, text-to-speech
M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis | May 3, 2023 | Speech Synthesis, text-to-speech
A Review of Deep Learning Techniques for Speech Processing | Apr 30, 2023 | Automatic Speech Recognition, Deep Learning
Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model | Apr 24, 2023 | Rhythm, Self-Supervised Learning
Ensemble prosody prediction for expressive speech synthesis | Apr 3, 2023 | Diversity, Ensemble Learning
Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis | Mar 27, 2023 | All, Automatic Speech Recognition
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis | Mar 24, 2023 | Generative Adversarial Network, Speech Synthesis
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI | Mar 23, 2023 | Speech Enhancement, Speech Synthesis
Transformers in Speech Processing: A Survey | Mar 21, 2023 | Automatic Speech Recognition, Speech Enhancement
QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis | Mar 14, 2023 | Emotional Speech Synthesis, Sentence
Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis | Mar 14, 2023 | Prosody Prediction, Speech Synthesis
Controllable Prosody Generation With Partial Inputs | Mar 14, 2023 | Speech Synthesis, text-to-speech
VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation | Mar 14, 2023 | Disentanglement, Speech Synthesis
Do Prosody Transfer Models Transfer Prosody? | Mar 7, 2023 | Speech Synthesis, text-to-speech
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model | Mar 6, 2023 | Language Modeling, Language Modelling
DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction | Mar 1, 2023 | Dynamic Time Warping, Metric Learning
ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations | Mar 1, 2023 | Self-Supervised Learning, Speech Synthesis
On the Audio-visual Synchronization for Lip-to-Speech Synthesis | Mar 1, 2023 | Audio-Visual Synchronization, Lip to Speech Synthesis
UniFLG: Unified Facial Landmark Generator from Text or Speech | Feb 28, 2023 | Decoder, Face Generation
ClArTTS: An Open-Source Classical Arabic Text-to-Speech Corpus | Feb 28, 2023 | Speech Synthesis, text-to-speech
CrossSpeech: Speaker-independent Acoustic Representation for Cross-lingual Speech Synthesis | Feb 28, 2023 | Speech Synthesis, text-to-speech
Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages | Feb 13, 2023 | Speech Synthesis, text-to-speech
Beyond Statistical Similarity: Rethinking Metrics for Deep Generative Models in Engineering Design | Feb 6, 2023 | Drug Discovery, Learning Theory
UzbekTagger: The rule-based POS tagger for Uzbek language | Jan 30, 2023 | Language Modeling, Language Modelling
Time out of Mind: Generating Rate of Speech conditioned on emotion and speaker | Jan 29, 2023 | Speech Synthesis, text-to-speech | Code available
On granularity of prosodic representations in expressive text-to-speech | Jan 26, 2023 | Expressive Speech Synthesis, Speech Synthesis
Multilingual Multiaccented Multispeaker TTS with RADTTS | Jan 24, 2023 | Speech Synthesis
Regeneration Learning: A Learning Paradigm for Data Generation | Jan 21, 2023 | Image Generation, Representation Learning
Applying Automated Machine Translation to Educational Video Courses | Jan 9, 2023 | Machine Translation, Speech Synthesis
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration | Jan 1, 2023 | Audio-Visual Speech Recognition, Resynthesis
HMM-based data augmentation for E2E systems for building conversational speech synthesis systems | Dec 22, 2022 | Data Augmentation, Language Modeling
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement | Dec 21, 2022 | Audio-Visual Speech Recognition, Resynthesis
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language | Dec 16, 2022 | Language Modeling, Language Modelling
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder | Dec 16, 2022 | Representation Learning, Speech Synthesis
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis | Dec 13, 2022 | Data Augmentation, Speech Synthesis
SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech | Nov 30, 2022 | Speech Synthesis, text-to-speech
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing | Nov 30, 2022 | Machine Translation, Sentence