Speech waveform synthesis from MFCC sequences with generative adversarial networks Apr 3, 2018 Generative Adversarial Network Speech Synthesis
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis Feb 28, 2020 Speech Synthesis text-to-speech
Robust and fine-grained prosody control of end-to-end speech synthesis Nov 6, 2018 Expressive Speech Synthesis Speech Synthesis
Recurrent Quantum Neural Networks Jun 25, 2020 Benchmarking BIG-bench Machine Learning
Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts May 10, 2022 Speech Synthesis Voice Conversion
PromptTTS: Controllable Text-to-Speech with Text Descriptions Nov 22, 2022 Decoder Speech Synthesis
Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models Jun 17, 2025 Kolmogorov-Arnold Networks Self-Supervised Learning
Probing the Feasibility of Multilingual Speaker Anonymization Jul 3, 2024 Speaker anonymization Speech Synthesis
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior Jun 11, 2021 Audio Generation Denoising
RawNet: Fast End-to-End Neural Vocoder Apr 10, 2019 Speech Synthesis
Articulatory Feature Prediction from Surface EMG during Speech Production May 20, 2025 Electromyography (EMG) Speech Synthesis
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis Apr 26, 2021 Language Modeling Language Modelling
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting Oct 8, 2023 Prediction Speech Synthesis
Parallel WaveNet: Fast High-Fidelity Speech Synthesis Nov 28, 2017 Speech Synthesis Vocal Bursts Intensity Prediction
ChatGPT in the context of precision agriculture data analytics Nov 10, 2023 Language Modelling speech-recognition
One-Class Learning with Adaptive Centroid Shift for Audio Deepfake Detection Jun 24, 2024 Audio Deepfake Detection DeepFake Detection
OmniDRCA: Parallel Speech-Text Foundation Model via Dual-Resolution Speech Representations and Contrastive Alignment Jun 11, 2025 cross-modal alignment Question Answering
Neural Voice Cloning with a Few Samples Feb 14, 2018 Speech Synthesis Voice Cloning
NVC-Net: End-to-End Adversarial Voice Conversion Jun 2, 2021 GPU Speech Synthesis
Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish May 31, 2022 Machine Translation Speech Synthesis
Can large-scale vocoded spoofed data improve speech spoofing countermeasure with a self-supervised front end? Sep 12, 2023 Self-Supervised Learning Speech Synthesis
CaloFlow II: Even Faster and Still Accurate Generation of Calorimeter Showers with Normalizing Flows Oct 21, 2021 Speech Synthesis
Neural Autoregressive Flows Apr 3, 2018 Density Estimation Speech Synthesis
A Practical Guide to Logical Access Voice Presentation Attack Detection Jan 10, 2022 Artifact Detection Speaker Verification
Multimodal Latent Language Modeling with Next-Token Diffusion Dec 11, 2024 Image Generation Language Modeling
A Fast and Accurate Pitch Estimation Algorithm Based on the Pseudo Wigner-Ville Distribution Oct 27, 2022 Speech Synthesis
Multi-view Temporal Alignment for Non-parallel Articulatory-to-Acoustic Speech Synthesis Dec 30, 2020 Dynamic Time Warping MULTI-VIEW LEARNING
Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS Aug 3, 2023 Denoising Speech Synthesis
Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq May 25, 2018 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers Sep 5, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis Oct 23, 2019 Form Speech Synthesis
Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation Dec 8, 2020 Attribute Disentanglement
Maximizing Mutual Information for Tacotron Aug 30, 2019 Attribute Speech Synthesis
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM May 24, 2023 Language Modelling Question Answering
Learning pronunciation from a foreign language in speech synthesis networks Oct 22, 2018 Speech Synthesis
Learning latent representations for style control and transfer in end-to-end speech synthesis Dec 11, 2018 Speech Synthesis Style Transfer
Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting Feb 19, 2024 Language Modeling Language Modelling
Language Technology Programme for Icelandic 2019-2023 Mar 20, 2020 Machine Translation speech-recognition
MelNet: A Generative Model for Audio in the Frequency Domain Jun 4, 2019 Audio Generation Music Generation
Jejueo Datasets for Machine Translation and Speech Synthesis Nov 27, 2019 Machine Translation Speech Synthesis
JSSS: free Japanese speech corpus for summarization and simplification Oct 5, 2020 Form Speech Synthesis
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language Oct 29, 2018 Speech Synthesis text-to-speech
Intra- and Inter-modal Context Interaction Modeling for Conversational Speech Synthesis Dec 25, 2024 Contrastive Learning Speech Synthesis
JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis Oct 28, 2017 BIG-bench Machine Learning Speech Synthesis
A Variational Prosody Model for the decomposition and synthesis of speech prosody Jun 22, 2018 Speech Synthesis
Integrated Speech and Gesture Synthesis Aug 25, 2021 Speech Synthesis text-to-speech
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network Jan 31, 2020 Quantization Speech Synthesis
Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms May 18, 2023 Speech Synthesis
Improving Self-Supervised Learning-based MOS Prediction Networks Apr 23, 2022 Prediction Quantization
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation Mar 31, 2024 Language Modeling Language Modelling