DiffWave: A Versatile Diffusion Model for Audio Synthesis Sep 21, 2020 Audio Synthesis Diversity
Code Available 1 WaveGrad: Estimating Gradients for Waveform Generation Sep 2, 2020 Speech Synthesis Text-To-Speech Synthesis
Code Available 1 Dynamical Variational Autoencoders: A Comprehensive Review Aug 28, 2020 3D Human Dynamics Resynthesis
Code Available 1 Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning Aug 20, 2020 Expressive Speech Synthesis Speech Synthesis
Code Available 1 Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion Aug 13, 2020 Speech Synthesis text-to-speech
Code Available 1 Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding Aug 12, 2020 Speech Synthesis text-to-speech
Code Available 1 Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions Aug 9, 2020 Speech Synthesis text-to-speech
Code Available 1 SpeedySpeech: Efficient Neural Speech Synthesis Aug 9, 2020 Audio Synthesis CPU
Code Available 1 Phonological Features for 0-shot Multilingual Speech Synthesis Aug 6, 2020 Speech Synthesis text-to-speech
Code Available 1 A Spectral Energy Distance for Parallel Speech Synthesis Aug 3, 2020 scoring rule Speech Synthesis
Code Available 1 One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech Aug 3, 2020 Meta-Learning Speech Synthesis
Code Available 1 VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network Jul 30, 2020 CPU GPU
Code Available 1 NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity Jun 11, 2020 Density Estimation Normalising Flows
Code Available 1 WaveNODE: A Continuous Normalizing Flow for Speech Synthesis Jun 8, 2020 Speech Synthesis
Code Available 1 FastSpeech 2: Fast and High-Quality End-to-End Text to Speech Jun 8, 2020 Knowledge Distillation Speech Synthesis
Code Available 1 End-to-End Adversarial Text-to-Speech Jun 5, 2020 Adversarial Text Dynamic Time Warping
Code Available 1 PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives Jun 2, 2020 speech-recognition Speech Recognition
Code Available 1 Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis May 17, 2020 Lip Reading Lip to Speech Synthesis
Code Available 1 Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis May 12, 2020 Speech Synthesis Style Transfer
Code Available 1 TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese May 11, 2020 Denoising Speech Synthesis
Code Available 1 From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint May 10, 2020 Speaker Verification Speech Synthesis
Code Available 1 Can Speaker Augmentation Improve Multi-Speaker End-to-End TTS? May 4, 2020 Speech Synthesis
Code Available 1 Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0 Mar 14, 2020 Clustering Representation Learning
Code Available 1 A Neuro-AI Interface for Evaluating Generative Adversarial Networks Mar 5, 2020 Speech Synthesis
Code Available 1 A Resource for Computational Experiments on Mapudungun Dec 4, 2019 Machine Translation speech-recognition
Code Available 1 WaveFlow: A Compact Flow-based Model for Raw Audio Dec 3, 2019 GPU Speech Synthesis
Code Available 1 MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis Oct 8, 2019 CPU GPU
Code Available 1 Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis Jun 8, 2019 Expressive Speech Synthesis Speech Synthesis
Code Available 1 Synthetic-Neuroscore: Using A Neuro-AI Interface for Evaluating Generative Adversarial Networks May 10, 2019 Image Generation Speech Synthesis
Code Available 1 In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data Apr 4, 2019 Speech Synthesis text-to-speech
Code Available 1 Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis Mar 27, 2019 Emotional Speech Synthesis Expressive Speech Synthesis
Code Available 1 Exploring Transfer Learning for Low Resource Emotional TTS Jan 14, 2019 Deep Learning Emotional Speech Synthesis
Code Available 1 Learning pronunciation from a foreign language in speech synthesis networks Nov 23, 2018 Speech Synthesis
Code Available 1 ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech Jul 19, 2018 Speech Synthesis text-to-speech
Code Available 1 FonBund: A Library for Combining Cross-lingual Phonological Segment Data May 1, 2018 Language Modeling Language Modelling
Code Available 1 Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron Mar 24, 2018 Expressive Speech Synthesis Speech Synthesis
Code Available 1 Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis Mar 23, 2018 Speech Synthesis Style Transfer
Code Available 1 Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Dec 16, 2017 Speech Synthesis
Code Available 1 Tacotron: Towards End-to-End Speech Synthesis Mar 29, 2017 Audio Synthesis Speech Synthesis
Code Available 1 WaveNet: A Generative Model for Raw Audio Sep 12, 2016 Audio Generation model
Code Available 1 NonverbalTTS: A Public English Corpus of Text-Aligned Nonverbal Vocalizations with Emotion Annotations for Text-to-Speech Jul 17, 2025 Automatic Speech Recognition Automatic Speech Recognition (ASR)
— Unverified 0 Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis Jul 8, 2025 Data Augmentation Mixture-of-Experts
— Unverified 0 A Hybrid Machine Learning Framework for Optimizing Crop Selection via Agronomic and Economic Forecasting Jul 6, 2025 Hybrid Machine Learning speech-recognition
— Unverified 0 DeepGesture: A conversational gesture synthesis system based on emotions and semantics Jul 3, 2025 Gesture Generation Motion Synthesis
Code Available 0 OpusLM: A Family of Open Unified Speech Language Models Jun 21, 2025 Decoder speech-recognition
— Unverified 0 An accurate and revised version of optical character recognition-based speech synthesis using LabVIEW Jun 18, 2025 Optical Character Recognition Optical Character Recognition (OCR)
— Unverified 0 Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models Jun 17, 2025 Kolmogorov-Arnold Networks Self-Supervised Learning
Code Available 0 From Flat to Feeling: A Feasibility and Impact Study on Dynamic Facial Emotions in AI-Generated Avatars Jun 16, 2025 GPU Speech Synthesis
— Unverified 0 S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation Jun 11, 2025 Reading Comprehension Speech Synthesis
— Unverified 0 UmbraTTS: Adapting Text-to-Speech to Environmental Contexts with Flow Matching Jun 11, 2025 Speech Synthesis text-to-speech
— Unverified 0