RapFlow-TTS: Rapid and High-Fidelity Text-to-Speech with Improved Consistency Flow Matching (Jun 20, 2025). Tags: Scheduling, Speech Synthesis.
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability (Jun 27, 2024). Tags: Speech Synthesis, text-to-speech.
Sample-Efficient Diffusion for Text-To-Speech Synthesis (Sep 1, 2024). Tags: Language Modeling, Language Modelling.
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis (Jul 13, 2024). Tags: Mamba, speech-recognition.
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models (Mar 31, 2024). Tags: Denoising, Speech Synthesis.
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram (Oct 25, 2019). Tags: Generative Adversarial Network, GPU.
P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting (Sep 22, 2023). Tags: Decoder, Speech Synthesis.
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model (May 11, 2023). Tags: Denoising, GPU.
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality (May 9, 2022). Tags: Sentence, Speech Synthesis.
Neural Speech Synthesis with Transformer Network (Sep 19, 2018). Tags: Decoder, Machine Translation.
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis (Sep 11, 2024). Tags: Decoder, Speech Synthesis.
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space (May 19, 2025). Tags: Language Modeling, Language Modelling.
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech (Feb 8, 2023). Tags: Code Generation, Diversity.
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network (Sep 6, 2023). Tags: Generative Adversarial Network, Speech Synthesis.
Efficient Neural Audio Synthesis (Feb 23, 2018). Tags: Audio Synthesis, CPU.
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction (Oct 28, 2018). Tags: Prediction, Speech Synthesis.
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform (Mar 4, 2022). Tags: Speech Synthesis, text-to-speech.
Improving Opus Low Bit Rate Quality with Neural Speech Synthesis (Aug 10, 2020). Tags: Decoder, Speech Synthesis.
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT (Oct 7, 2023). Tags: Audio captioning, Automatic Speech Recognition.
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis (Oct 12, 2020). Tags: CPU, GPU.
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform (Sep 18, 2023). Tags: Speech Synthesis.
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling (Oct 14, 2023). Tags: Speech Synthesis, text-to-speech.
Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows (Mar 3, 2022). Tags: Speech Synthesis, text-to-speech.
Llama-VITS: Enhancing TTS Synthesis with Semantic Awareness (Apr 10, 2024). Tags: Speech Synthesis, text-to-speech.
A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet (Mar 28, 2019). Tags: Speech Synthesis.
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control (Oct 1, 2024). Tags: Emotional Speech Synthesis, Speech Synthesis.
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech (May 15, 2022). Tags: Speech Synthesis, Style Transfer.
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis (Mar 25, 2022). Tags: Image Generation, Speech Synthesis.
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis (Oct 30, 2024). Tags: Speech Synthesis, text-to-speech.
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers (Apr 18, 2023). Tags: In-Context Learning, Speech Synthesis.
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis (May 12, 2020). Tags: Speech Synthesis, Style Transfer.
Dynamical Variational Autoencoders: A Comprehensive Review (Aug 28, 2020). Tags: 3D Human Dynamics, Resynthesis.
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection (Oct 18, 2021). Tags: Speech Synthesis, Synthetic Speech Detection.
Fine-grained style control in Transformer-based Text-to-speech Synthesis (Oct 12, 2021). Tags: Inductive Bias, Speech Synthesis.
dMel: Speech Tokenization made Simple (Jul 22, 2024). Tags: Decoder, Language Modeling.
FonBund: A Library for Combining Cross-lingual Phonological Segment Data (May 1, 2018). Tags: Language Modeling, Language Modelling.
Disentanglement in a GAN for Unconditional Speech Synthesis (Jul 4, 2023). Tags: Disentanglement, Generative Adversarial Network.
Digital Voicing of Silent Speech (Oct 6, 2020). Tags: Electromyography (EMG), Speech Synthesis.
EdiTTS: Score-based Editing for Controllable Text-to-Speech (Oct 6, 2021). Tags: Speech Synthesis, Speech-to-Text.
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding (Aug 15, 2023). Tags: Speech Synthesis.
Generative Expressive Conversational Speech Synthesis (Jul 31, 2024). Tags: Speech Synthesis.
DiffWave: A Versatile Diffusion Model for Audio Synthesis (Sep 21, 2020). Tags: Audio Synthesis, Diversity.
Fine-Grained and Interpretable Neural Speech Editing (Jul 7, 2024). Tags: Data Augmentation, Speech Synthesis.
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint (May 10, 2020). Tags: Speaker Verification, Speech Synthesis.
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training (Jul 31, 2023). Tags: Denoising, Expressive Speech Synthesis.
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis (Jun 29, 2021). Tags: Speech Synthesis, text-to-speech.
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech (Jun 8, 2020). Tags: Knowledge Distillation, Speech Synthesis.
Developing multilingual speech synthesis system for Ojibwe, Mi'kmaq, and Maliseet (Feb 4, 2025). Tags: Speech Synthesis, text-to-speech.
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data (May 18, 2023). Tags: Speech Enhancement, Speech Synthesis.
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech (Oct 1, 2023). Tags: speech-recognition, Speech Recognition.