Detection of Prosodic Boundaries in Speech Using Wav2Vec 2.0 Sep 29, 2022 Sentence Speech Synthesis
Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla May 31, 2021 Deep Learning speech-recognition
Developing multilingual speech synthesis system for Ojibwe, Mi'kmaq, and Maliseet Feb 4, 2025 Speech Synthesis text-to-speech
Learning pronunciation from a foreign language in speech synthesis networks Nov 23, 2018 Speech Synthesis
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis Jun 15, 2021 speech-recognition Speech Recognition
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models May 21, 2025 Bayesian Optimization Speech Synthesis
RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis Dec 15, 2022 Relation Speech Synthesis
SafeSpeech: Robust and Universal Voice Protection Against Malicious Speech Synthesis Apr 14, 2025 Face Swapping Speech Synthesis
Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis Jan 11, 2025 Attribute Benchmarking
Deep Speech Synthesis from MRI-Based Articulatory Representations Jul 5, 2023 Computational Efficiency Denoising
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks Jun 14, 2022 Action Segmentation Instance Segmentation
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation May 29, 2023 Speech Synthesis text-to-speech
CDPAM: Contrastive learning for perceptual audio similarity Feb 9, 2021 Contrastive Learning Speech Enhancement
A Resource for Computational Experiments on Mapudungun Dec 4, 2019 Machine Translation speech-recognition
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding Aug 12, 2020 Speech Synthesis text-to-speech
Deep Speech Synthesis from Articulatory Representations Sep 13, 2022 Speech Synthesis
Mitigating Unauthorized Speech Synthesis for Voice Protection Oct 28, 2024 Data Augmentation Face Swapping
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings Oct 7, 2021 Language Modeling Language Modelling
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech Jul 19, 2018 Speech Synthesis text-to-speech
Articulation GAN: Unsupervised modeling of articulatory learning Oct 27, 2022 Generative Adversarial Network Speech Synthesis
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing Nov 4, 2022 Diversity Speaker Verification
MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset Dec 11, 2022 Speech Synthesis text-to-speech
TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese May 11, 2020 Denoising Speech Synthesis
Deep Learning Based Assessment of Synthetic Speech Naturalness Apr 23, 2021 Deep Learning Prediction
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data Apr 4, 2019 Speech Synthesis text-to-speech
Cross-modal information fusion for voice spoofing detection Feb 1, 2023 Automatic Speech Recognition fake voice detection
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech Oct 8, 2021 Emotion Interpretation Expressive Speech Synthesis
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis May 9, 2022 Deep Learning Semantic Communication
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Dec 16, 2017 Speech Synthesis
Region-Based Optimization in Continual Learning for Audio Deepfake Detection Dec 16, 2024 Audio Deepfake Detection Continual Learning
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed Sep 23, 2022 Pitch control Speech Synthesis
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training Jul 31, 2023 Denoising Expressive Speech Synthesis
A Spectral Energy Distance for Parallel Speech Synthesis Aug 3, 2020 scoring rule Speech Synthesis
Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization May 1, 2022 Speech Synthesis
One TTS Alignment To Rule Them All Aug 23, 2021 All Speech Synthesis
WaveGrad: Estimating Gradients for Waveform Generation Sep 2, 2020 Speech Synthesis Text-To-Speech Synthesis
Assem-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques Apr 2, 2021 Decoder Rhythm
One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech Aug 3, 2020 Meta-Learning Speech Synthesis
Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions Aug 9, 2020 Speech Synthesis text-to-speech
TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis Nov 24, 2020 Generative Adversarial Network Speech Synthesis
Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models Jun 17, 2025 Kolmogorov-Arnold Networks Self-Supervised Learning
A Critical Review of Recurrent Neural Networks for Sequence Learning May 29, 2015 Handwriting Recognition Image Captioning
PromptTTS: Controllable Text-to-Speech with Text Descriptions Nov 22, 2022 Decoder Speech Synthesis
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis Apr 26, 2021 Language Modeling Language Modelling
Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish May 31, 2022 Machine Translation Speech Synthesis
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting Oct 8, 2023 Prediction Speech Synthesis
Parallel WaveNet: Fast High-Fidelity Speech Synthesis Nov 28, 2017 Speech Synthesis Vocal Bursts Intensity Prediction
RawNet: Fast End-to-End Neural Vocoder Apr 10, 2019 Speech Synthesis
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis Feb 28, 2020 Speech Synthesis text-to-speech
OmniDRCA: Parallel Speech-Text Foundation Model via Dual-Resolution Speech Representations and Contrastive Alignment Jun 11, 2025 cross-modal alignment Question Answering