A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis Mar 22, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
— Unverified 0Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data Mar 22, 2022 speech-recognition Speech Recognition
— Unverified 0AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling Mar 21, 2022 Decoder Speech Synthesis
— Unverified 0ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis Mar 20, 2022 Speaker Verification Speech Synthesis
Code Code Available 0AdaVocoder: Adaptive Vocoder for Custom Voice Mar 18, 2022 Speech Synthesis Transfer Learning
— Unverified 0Robotic Speech Synthesis: Perspectives on Interactions, Scenarios, and Ethics Mar 17, 2022 Ethics Speech Synthesis
— Unverified 0Whither the Priors for (Vocal) Interactivity? Mar 16, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
— Unverified 0Text-free non-parallel many-to-many voice conversion using normalising flows Mar 15, 2022 Normalising Flows Speech Synthesis
— Unverified 0Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis Mar 2, 2022 Speech Synthesis
— Unverified 0Improving Cross-lingual Speech Synthesis with Triplet Training Scheme Feb 22, 2022 Speech Synthesis text-to-speech
— Unverified 0VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion Feb 18, 2022 Quantization Speech Synthesis
— Unverified 0Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module Feb 16, 2022 Speech Synthesis text-to-speech
— Unverified 0Unsupervised word-level prosody tagging for controllable speech synthesis Feb 15, 2022 Speech Synthesis text-to-speech
— Unverified 0Partially Fake Audio Detection by Self-attention-based Fake Span Discovery Feb 14, 2022 Open-Ended Question Answering Question Answering
— Unverified 0Deep Performer: Score-to-Audio Music Performance Synthesis Feb 12, 2022 Decoder Speech Synthesis
— Unverified 0Transformer-based Models of Text Normalization for Speech Applications Feb 1, 2022 Sentence Speech Synthesis
— Unverified 0DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs Jan 28, 2022 Denoising Speech Synthesis
— Unverified 0Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention Jan 25, 2022 Form Speech Synthesis
— Unverified 0Towards a Real-time Measure of the Perception of Anthropomorphism in Human-robot Interaction Jan 24, 2022 Speech Synthesis
Code Code Available 0Cross-Lingual Text-to-Speech Using Multi-Task Learning and Speaker Classifier Joint Training Jan 20, 2022 Multi-Task Learning Speech Synthesis
— Unverified 0Deep Speech Synthesis from Articulatory Features Jan 16, 2022 Speech Synthesis
— Unverified 0A Practical Guide to Logical Access Voice Presentation Attack Detection Jan 10, 2022 Artifact Detection Speaker Verification
— Unverified 0Quasi-Taylor Samplers for Diffusion Generative Models based on Ideal Derivatives Dec 26, 2021 Denoising Image Generation
— Unverified 0Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios Dec 23, 2021 Diversity Speech Synthesis
— Unverified 0整合語者嵌入向量與後置濾波器於提升個人化合成語音之語者相似度 (Incorporating Speaker Embedding and Post-Filter Network for Improving Speaker Similarity of Personalized Speech Synthesis System) Dec 1, 2021 Speech Synthesis
— Unverified 0Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance Nov 23, 2021 speech-recognition Speech Recognition
— Unverified 0Word-Level Style Control for Expressive, Non-attentive Speech Synthesis Nov 19, 2021 Expressive Speech Synthesis Speech Synthesis
— Unverified 0Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis Nov 19, 2021 Clustering Decoder
— Unverified 0High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency Nov 17, 2021 CPU Decoder
— Unverified 0Cross-lingual Low Resource Speaker Adaptation Using Phonological Features Nov 17, 2021 Speech Synthesis
— Unverified 0Speech Synthesis for Low Resource Languages using Transliteration Enabled Transfer Learning Nov 16, 2021 speech-recognition Speech Recognition
— Unverified 0Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no access to speech data Nov 16, 2021 speech-recognition Speech Recognition
— Unverified 0Improving Prosody for Unseen Texts in Speech Synthesis by Utilizing Linguistic Information and Noisy Data Nov 15, 2021 Chinese Word Segmentation Multi-Task Learning
— Unverified 0Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity Nov 2, 2021 Cross-Lingual Transfer speech-recognition
Code Code Available 0fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit Nov 1, 2021 Speech Synthesis text-to-speech
— Unverified 0Assessing Evaluation Metrics for Speech-to-Speech Translation Oct 26, 2021 Machine Translation Open-Ended Question Answering
— Unverified 0DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021 Oct 25, 2021 Speech Synthesis text-to-speech
Code Code Available 0Synt++: Utilizing Imperfect Synthetic Data to Improve Speech Recognition Oct 21, 2021 Automatic Speech Recognition Automatic Speech Recognition (ASR)
— Unverified 0CaloFlow II: Even Faster and Still Accurate Generation of Calorimeter Showers with Normalizing Flows Oct 21, 2021 Speech Synthesis
Code Code Available 0From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation Oct 15, 2021 Data Augmentation Simultaneous Speech-to-Speech Translation
— Unverified 0Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention Oct 15, 2021 Simultaneous Speech-to-Speech Translation Speech Synthesis
— Unverified 0SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation Oct 14, 2021 Generative Adversarial Network GPU
— Unverified 0DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding Oct 13, 2021 Speech Synthesis Voice Conversion
— Unverified 0Systematic Inequalities in Language Technology Performance across the World's Languages Oct 13, 2021 Dependency Parsing Machine Translation
Code Code Available 0LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example Oct 11, 2021 Speech Synthesis
— Unverified 0Using multiple reference audios and style embedding constraints for speech synthesis Oct 9, 2021 Sentence Sentence Similarity
— Unverified 0Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis Oct 9, 2021 Lifelong learning Speech Synthesis
Code Code Available 0Environment Aware Text-to-Speech Synthesis Oct 8, 2021 Attribute Disentanglement
— Unverified 0VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over Oct 7, 2021 Speech Synthesis text-to-speech
— Unverified 0Cloning one's voice using very limited data in the wild Oct 7, 2021 Speech Synthesis
— Unverified 0