PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping | Nov 8, 2022 | Generative Adversarial Network Speech Synthesis | Code Available | 1
Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla | May 31, 2021 | Deep Learning speech-recognition | Code Available | 1
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models | May 21, 2025 | Bayesian Optimization Speech Synthesis | Code Available | 1
Digital Voicing of Silent Speech | Oct 6, 2020 | Electromyography (EMG) Speech Synthesis | Code Available | 1
RAD-TTS: Parallel Flow-Based TTS with Robust Alignment Learning and Diverse Synthesis | Jun 2, 2021 | Diversity Rhythm | Code Available | 1
Disentanglement in a GAN for Unconditional Speech Synthesis | Jul 4, 2023 | Disentanglement Generative Adversarial Network | Code Available | 1
Effective Deep Learning Models for Automatic Diacritization of Arabic Text | Nov 1, 2020 | Arabic Text Diacritization Decoder | Code Available | 1
Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis | Jan 11, 2025 | Attribute Benchmarking | Code Available | 1
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme | Sep 28, 2021 | Speech Synthesis Voice Conversion | Code Available | 1
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data | May 18, 2023 | Speech Enhancement Speech Synthesis | Code Available | 1
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding | Aug 15, 2023 | Speech Synthesis | Code Available | 1
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing | Nov 4, 2022 | Diversity Speaker Verification | Code Available | 1
CDPAM: Contrastive learning for perceptual audio similarity | Feb 9, 2021 | Contrastive Learning Speech Enhancement | Code Available | 1
A Resource for Computational Experiments on Mapudungun | Dec 4, 2019 | Machine Translation speech-recognition | Code Available | 1
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation | May 29, 2023 | Speech Synthesis text-to-speech | Code Available | 1
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding | Aug 12, 2020 | Speech Synthesis text-to-speech | Code Available | 1
SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers | Nov 15, 2024 | Image Generation Speech Synthesis | Code Available | 1
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training | Jul 31, 2023 | Denoising Expressive Speech Synthesis | Code Available | 1
DiffWave: A Versatile Diffusion Model for Audio Synthesis | Sep 21, 2020 | Audio Synthesis Diversity | Code Available | 1
Articulation GAN: Unsupervised modeling of articulatory learning | Oct 27, 2022 | Generative Adversarial Network Speech Synthesis | Code Available | 1
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis | Jun 8, 2019 | Expressive Speech Synthesis Speech Synthesis | Code Available | 1
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing | Oct 14, 2021 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code Available | 1
Deep Speech Synthesis from MRI-Based Articulatory Representations | Jul 5, 2023 | Computational Efficiency Denoising | Code Available | 1
Deep Speech Synthesis from Articulatory Representations | Sep 13, 2022 | Speech Synthesis | Code Available | 1
Generative Expressive Conversational Speech Synthesis | Jul 31, 2024 | Speech Synthesis | Code Available | 1
Detection of Prosodic Boundaries in Speech Using Wav2Vec 2.0 | Sep 29, 2022 | Sentence Speech Synthesis | Code Available | 1
ArTST: Arabic Text and Speech Transformer | Oct 25, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code Available | 1
Deep Learning Based Assessment of Synthetic Speech Naturalness | Apr 23, 2021 | Deep Learning Prediction | Code Available | 1
Synthesizing Speech from Intracranial Depth Electrodes using an Encoder-Decoder Framework | Nov 2, 2021 | Decoder EEG | Code Available | 1
Tacotron: Towards End-to-End Speech Synthesis | Mar 29, 2017 | Audio Synthesis Speech Synthesis | Code Available | 1
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech | Oct 8, 2021 | Emotion Interpretation Expressive Speech Synthesis | Code Available | 1
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis | May 9, 2022 | Deep Learning Semantic Communication | Code Available | 1
A Spectral Energy Distance for Parallel Speech Synthesis | Aug 3, 2020 | scoring rule Speech Synthesis | Code Available | 1
Developing multilingual speech synthesis system for Ojibwe, Mi'kmaq, and Maliseet | Feb 4, 2025 | Speech Synthesis text-to-speech | Code Available | 1
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed | Sep 23, 2022 | Pitch control Speech Synthesis | Code Available | 1
Cross-modal information fusion for voice spoofing detection | Feb 1, 2023 | Automatic Speech Recognition fake voice detection | Code Available | 1
EfficientNet-Absolute Zero for Continuous Speech Keyword Spotting | Dec 31, 2020 | Keyword Spotting Keyword Spotting CSS | Code Available | 1
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition | Mar 29, 2022 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code Available | 1
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint | May 10, 2020 | Speaker Verification Speech Synthesis | Code Available | 1
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions | Dec 16, 2017 | Speech Synthesis | Code Available | 1
A Machine of Few Words -- Interactive Speaker Recognition with Reinforcement Learning | Aug 7, 2020 | Decision Making reinforcement-learning | Unverified | 0
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond | Feb 17, 2025 | Contrastive Learning EEG | Unverified | 0
Constructive Interaction for Talking about Interesting Topics | May 1, 2012 | Management Speech Recognition | Unverified | 0
Construction of English-French Multimodal Affective Conversational Corpus from TV Dramas | May 1, 2018 | Emotion Recognition Speech Recognition | Unverified | 0
A Survey of Voice Translation Methodologies - Acoustic Dialect Decoder | Oct 13, 2016 | Decoder Sentence | Unverified | 0
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input | Feb 19, 2021 | Language Modeling Language Modelling | Unverified | 0
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding | Oct 17, 2024 | Speech Synthesis | Unverified | 0
Conditioning Sequence-to-sequence Networks with Learned Activations | Sep 29, 2021 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified | 0
Contextual Expressive Text-to-Speech | Nov 26, 2022 | Speech Synthesis text-to-speech | Unverified | 0
Conditional Spoken Digit Generation with StyleGAN | Sep 15, 2020 | Image Generation Speech Synthesis | Unverified | 0