KazEmoTTS: A Dataset for Kazakh Emotional Text-to-Speech Synthesis | Apr 1, 2024 | Speech Synthesis text-to-speech | Code Available 1
Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling | Apr 1, 2024 | Speaker Identification Speech Synthesis | Unverified 0
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models | Mar 31, 2024 | Denoising Speech Synthesis | Code Available 2
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation | Mar 31, 2024 | Language Modeling Language Modelling | Code Available 0
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator | Mar 25, 2024 | Data Augmentation Generative Adversarial Network | Unverified 0
M^3AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset | Mar 21, 2024 | Diversity Script Generation | Unverified 0
An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis | Mar 19, 2024 | In-Context Learning Speech Synthesis | Unverified 0
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech | Mar 13, 2024 | GPU Speech Synthesis | Unverified 0
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction | Mar 8, 2024 | Audio Generation Computational Efficiency | Code Available 2
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models | Mar 5, 2024 | Quantization Speech Synthesis | Code Available 3
Towards Accurate Lip-to-Speech Synthesis in-the-Wild | Mar 2, 2024 | Language Modelling Lip to Speech Synthesis | Unverified 0
VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis | Mar 1, 2024 | Speech Synthesis | Unverified 0
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data | Feb 29, 2024 | Representation Learning Speech Synthesis | Unverified 0
Towards Decoding Brain Activity During Passive Listening of Speech | Feb 26, 2024 | Brain Computer Interface Speech Synthesis | Code Available 0
Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting | Feb 19, 2024 | Language Modeling Language Modelling | Code Available 0
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model | Feb 16, 2024 | Denoising Speech Enhancement | Unverified 0
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis | Feb 11, 2024 | Rhythm Speaker Identification | Unverified 0
EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks | Jan 31, 2024 | Audio Generation Speech Synthesis | Code Available 2
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition | Jan 31, 2024 | Decoder Language Modeling | Unverified 0
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis | Jan 30, 2024 | Generative Adversarial Network Speech Synthesis | Unverified 0
MunTTS: A Text-to-Speech System for Mundari | Jan 28, 2024 | Speech Synthesis text-to-speech | Unverified 0
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis | Jan 22, 2024 | Speaker Verification Speech Synthesis | Unverified 0
Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis | Jan 19, 2024 | CPU Speech Synthesis | Unverified 0
ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis | Jan 16, 2024 | Denoising Emotional Speech Synthesis | Unverified 0
Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters | Jan 10, 2024 | Self-Supervised Learning Speech Enhancement | Unverified 0
StreamVC: Real-Time Low-Latency Voice Conversion | Jan 5, 2024 | Speech Synthesis Voice Conversion | Unverified 0
Incremental FastPitch: Chunk-based High Quality Text to Speech | Jan 3, 2024 | Speech Synthesis text-to-speech | Unverified 0
Boosting Large Language Model for Speech Synthesis: An Empirical Study | Dec 30, 2023 | Language Modeling Language Modelling | Unverified 0
Normalization of Lithuanian Text Using Regular Expressions | Dec 29, 2023 | Speech Synthesis Text Normalization | Unverified 0
Creating New Voices using Normalizing Flows | Dec 22, 2023 | Speech Synthesis text-to-speech | Unverified 0
BrainTalker: Low-Resource Brain-to-Speech Synthesis with Transfer Learning using Wav2Vec 2.0 | Dec 21, 2023 | Speech Synthesis Transfer Learning | Unverified 0
Evaluating Speech-in-Speech Perception via a Humanoid Robot | Dec 19, 2023 | Speech Synthesis | Unverified 0
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling | Dec 19, 2023 | Contrastive Learning Speech Synthesis | Code Available 1
StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis | Dec 19, 2023 | Decoder Speech Synthesis | Unverified 0
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis | Dec 17, 2023 | Speech Synthesis Style Transfer | Unverified 0
CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis | Dec 16, 2023 | Contrastive Learning Self-Supervised Learning | Unverified 0
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection | Dec 15, 2023 | Audio Deepfake Detection Continual Learning | Code Available 1
Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism | Dec 11, 2023 | Face Generation Lip Reading | Code Available 1
Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks | Dec 10, 2023 | Representation Learning Speech Synthesis | Unverified 0
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis | Dec 8, 2023 | Benchmarking Quantization | Unverified 0
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis | Dec 6, 2023 | Speech Synthesis text-to-speech | Unverified 0
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints | Dec 2, 2023 | Speech Synthesis text-to-speech | Unverified 0
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech | Nov 24, 2023 | Dimensionality Reduction Emotion Classification | Code Available 1
Guided Flows for Generative Modeling and Decision Making | Nov 22, 2023 | Conditional Image Generation Decision Making | Unverified 0
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis | Nov 21, 2023 | Speech Synthesis Super-Resolution | Code Available 3
APNet2: High-quality and High-efficiency Neural Vocoder with Direct Prediction of Amplitude and Phase Spectra | Nov 20, 2023 | Speech Synthesis | Code Available 1
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis | Nov 20, 2023 | Speech Synthesis | Unverified 0
LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement | Nov 17, 2023 | Ensemble Learning Language Modelling | Unverified 0
ChatGPT in the context of precision agriculture data analytics | Nov 10, 2023 | Language Modelling speech-recognition | Code Available 0
Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning | Nov 7, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code Available 1