Direct speech-to-speech translation with a sequence-to-sequence model Apr 12, 2019 Speech Synthesis Speech-to-Speech Translation
Neural Autoregressive Flows Apr 3, 2018 Density Estimation Speech Synthesis
Multimodal Latent Language Modeling with Next-Token Diffusion Dec 11, 2024 Image Generation Language Modeling
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech Apr 3, 2021 Denoising GPU
A Variational Prosody Model for the decomposition and synthesis of speech prosody Jun 22, 2018 Speech Synthesis
Multi-view Temporal Alignment for Non-parallel Articulatory-to-Acoustic Speech Synthesis Dec 30, 2020 Dynamic Time Warping MULTI-VIEW LEARNING
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs Jan 28, 2022 Denoising Speech Synthesis
DialogueAgents: A Hybrid Agent-Based Speech Synthesis Framework for Multi-Party Dialogue Apr 20, 2025 Diversity Speech Synthesis
Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq May 25, 2018 Automatic Speech Recognition Automatic Speech Recognition (ASR)
MelNet: A Generative Model for Audio in the Frequency Domain Jun 4, 2019 Audio Generation Music Generation
Meta Learning Text-to-Speech Synthesis in over 7000 Languages Jun 10, 2024 Meta-Learning Speech Synthesis
Maximizing Mutual Information for Tacotron Aug 30, 2019 Attribute Speech Synthesis
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers Sep 5, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis Oct 23, 2019 Form Speech Synthesis
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM May 24, 2023 Language Modelling Question Answering
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021 Oct 25, 2021 Speech Synthesis text-to-speech
Deep Voice: Real-time Neural Text-to-Speech Feb 25, 2017 Audio Synthesis Boundary Detection
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning Oct 20, 2017 GPU Speech Synthesis
Deep Voice 2: Multi-Speaker Neural Text-to-Speech May 24, 2017 Speech Synthesis text-to-speech
Language Technology Programme for Icelandic 2019-2023 Mar 20, 2020 Machine Translation speech-recognition
DeepTalk: Vocal Style Encoding for Speaker Recognition and Speech Synthesis Dec 9, 2020 Speaker Recognition Speech Synthesis
JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis Oct 28, 2017 BIG-bench Machine Learning Speech Synthesis
Learning latent representations for style control and transfer in end-to-end speech synthesis Dec 11, 2018 Speech Synthesis Style Transfer
Deep Residual Neural Networks for Audio Spoofing Detection Jun 30, 2019 Speaker Verification Speech Synthesis
Jejueo Datasets for Machine Translation and Speech Synthesis Nov 27, 2019 Machine Translation Speech Synthesis
Integrated Speech and Gesture Synthesis Aug 25, 2021 Speech Synthesis text-to-speech
Intra- and Inter-modal Context Interaction Modeling for Conversational Speech Synthesis Dec 25, 2024 Contrastive Learning Speech Synthesis
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language Oct 29, 2018 Speech Synthesis text-to-speech
JSSS: free Japanese speech corpus for summarization and simplification Oct 5, 2020 Form Speech Synthesis
DeepGesture: A conversational gesture synthesis system based on emotions and semantics Jul 3, 2025 Gesture Generation Motion Synthesis
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network Jan 31, 2020 Quantization Speech Synthesis
Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms May 18, 2023 Speech Synthesis
High Fidelity Speech Synthesis with Adversarial Networks Sep 25, 2019 Generative Adversarial Network Speech Synthesis
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation Mar 31, 2024 Language Modeling Language Modelling
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis Nov 12, 2020 Speech Synthesis text-to-speech
Improving Self-Supervised Learning-based MOS Prediction Networks Apr 23, 2022 Prediction Quantization
Half-Truth: A Partially Fake Audio Detection Dataset Apr 8, 2021 Speech Synthesis
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition Aug 17, 2024 Language Modeling Language Modelling
Handling Background Noise in Neural Speech Generation Feb 23, 2021 Denoising Speech Synthesis
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram Apr 8, 2019 Speech Synthesis text-to-speech
Hierarchical Generative Modeling for Controllable Speech Synthesis Oct 16, 2018 Attribute Speech Synthesis
Independent and automatic evaluation of acoustic-to-articulatory inversion models Nov 15, 2019 speech-recognition Speech Recognition
Learning pronunciation from a foreign language in speech synthesis networks Oct 22, 2018 Speech Synthesis
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity Nov 2, 2021 Cross-Lingual Transfer speech-recognition
FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis Jul 8, 2022 Lip to Speech Synthesis Speech Synthesis
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit Sep 14, 2021 Speech Synthesis text-to-speech
Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging Jul 12, 2021 Prediction Speech Synthesis
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit Nov 1, 2021 Speech Synthesis text-to-speech
Audio Codec Augmentation for Robust Collaborative Watermarking of Speech Synthesis Sep 20, 2024 Face Swapping Speech Synthesis
Evaluating context-invariance in unsupervised speech representations Oct 27, 2022 Language Modelling speech-recognition