Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla. May 31, 2021. Tags: Deep Learning, speech-recognition. Code available.
Fine-Grained and Interpretable Neural Speech Editing. Jul 7, 2024. Tags: Data Augmentation, Speech Synthesis. Code available.
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems. Jun 19, 2025. Tags: Benchmarking, Descriptive. Code available.
UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts. Apr 29, 2024. Tags: Contrastive Learning, Speech Synthesis. Code available.
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech. Oct 1, 2023. Tags: speech-recognition, Speech Recognition. Code available.
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding. Mar 2, 2023. Tags: Speech Synthesis, text-to-speech. Code available.
Exploring Transfer Learning for Low Resource Emotional TTS. Jan 14, 2019. Tags: Deep Learning, Emotional Speech Synthesis. Code available.
TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese. May 11, 2020. Tags: Denoising, Speech Synthesis. Code available.
AutoDiff: combining Auto-encoder and Diffusion model for tabular data synthesizing. Oct 24, 2023. Tags: Language Modeling, Language Modelling. Code available.
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder. Nov 7, 2022. Tags: Speech Synthesis, text-to-speech. Code available.
AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder. Jan 9, 2025. Tags: Pitch Classification, Pitch control. Code available.
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis. May 26, 2023. Tags: Decoder, Speech Synthesis. Code available.
Fine-grained style control in Transformer-based Text-to-speech Synthesis. Oct 12, 2021. Tags: Inductive Bias, Speech Synthesis. Code available.
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions. May 19, 2022. Tags: Speech Synthesis, Style Transfer. Code available.
A Neuro-AI Interface for Evaluating Generative Adversarial Networks. Mar 5, 2020. Tags: Speech Synthesis. Code available.
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection. Oct 18, 2021. Tags: Speech Synthesis, Synthetic Speech Detection. Code available.
End-to-End Adversarial Text-to-Speech. Jun 5, 2020. Tags: Adversarial Text, Dynamic Time Warping. Code available.
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models. Oct 11, 2022. Tags: Disentanglement, Generative Adversarial Network. Code available.
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling. Dec 19, 2023. Tags: Contrastive Learning, Speech Synthesis. Code available.
Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion. Aug 13, 2020. Tags: Speech Synthesis, text-to-speech. Code available.
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis. Jun 29, 2021. Tags: Speech Synthesis, text-to-speech. Code available.
Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System. Nov 21, 2022. Tags: GPU, Speech Synthesis. Code available.
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels. May 22, 2023. Tags: Expressive Speech Synthesis, Speech Synthesis. Code available.
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech. Feb 27, 2023. Tags: Speech Synthesis, text-to-speech. Code available.
EfficientNet-Absolute Zero for Continuous Speech Keyword Spotting. Dec 31, 2020. Tags: Keyword Spotting, Keyword Spotting CSS. Code available.
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis. Jun 8, 2019. Tags: Expressive Speech Synthesis, Speech Synthesis. Code available.
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset. Apr 17, 2021. Tags: Speech Synthesis, text-to-speech. Code available.
Dynamical Variational Autoencoders: A Comprehensive Review. Aug 28, 2020. Tags: 3D Human Dynamics, Resynthesis. Code available.
dMel: Speech Tokenization made Simple. Jul 22, 2024. Tags: Decoder, Language Modeling. Code available.
EdiTTS: Score-based Editing for Controllable Text-to-Speech. Oct 6, 2021. Tags: Speech Synthesis, Speech-to-Text. Code available.
Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm. Oct 21, 2020. Tags: speaker-diarization, Speaker Diarization. Code available.
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis. May 17, 2020. Tags: Lip Reading, Lip to Speech Synthesis. Code available.
Digital Voicing of Silent Speech. Oct 6, 2020. Tags: Electromyography (EMG), Speech Synthesis. Code available.
Disentanglement in a GAN for Unconditional Speech Synthesis. Jul 4, 2023. Tags: Disentanglement, Generative Adversarial Network. Code available.
Effective Deep Learning Models for Automatic Diacritization of Arabic Text. Nov 1, 2020. Tags: Arabic Text Diacritization, Decoder. Code available.
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech. Jun 28, 2023. Tags: Emotion Recognition, Speech Synthesis. Code available.
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech. Jun 8, 2020. Tags: Knowledge Distillation, Speech Synthesis. Code available.
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training. Jul 31, 2023. Tags: Denoising, Expressive Speech Synthesis. Code available.
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis. Oct 8, 2019. Tags: CPU, GPU. Code available.
Bts-e: Audio deepfake detection using breathing-talking-silence encoder. May 5, 2023. Tags: Audio Deepfake Detection, DeepFake Detection. Code available.
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data. May 18, 2023. Tags: Speech Enhancement, Speech Synthesis. Code available.
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models. May 21, 2025. Tags: Bayesian Optimization, Speech Synthesis. Code available.
Detection of Prosodic Boundaries in Speech Using Wav2Vec 2.0. Sep 29, 2022. Tags: Sentence, Speech Synthesis. Code available.
Developing multilingual speech synthesis system for Ojibwe, Mi'kmaq, and Maliseet. Feb 4, 2025. Tags: Speech Synthesis, text-to-speech. Code available.
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme. Sep 28, 2021. Tags: Speech Synthesis, Voice Conversion. Code available.
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis. May 9, 2022. Tags: Deep Learning, Semantic Communication. Code available.
Deep Learning Based Assessment of Synthetic Speech Naturalness. Apr 23, 2021. Tags: Deep Learning, Prediction. Code available.
Deep Speech Synthesis from Articulatory Representations. Sep 13, 2022. Tags: Speech Synthesis. Code available.
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation. May 29, 2023. Tags: Speech Synthesis, text-to-speech. Code available.
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding. Aug 12, 2020. Tags: Speech Synthesis, text-to-speech. Code available.