Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting | Feb 19, 2024 | Language Modeling, Language Modelling | Code Available
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model | Feb 16, 2024 | Denoising, Speech Enhancement | Unverified
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis | Feb 11, 2024 | Rhythm, Speaker Identification | Unverified
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition | Jan 31, 2024 | Decoder, Language Modeling | Unverified
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis | Jan 30, 2024 | Generative Adversarial Network, Speech Synthesis | Unverified
MunTTS: A Text-to-Speech System for Mundari | Jan 28, 2024 | Speech Synthesis, text-to-speech | Unverified
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis | Jan 22, 2024 | Speaker Verification, Speech Synthesis | Unverified
Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis | Jan 19, 2024 | CPU, Speech Synthesis | Unverified
ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis | Jan 16, 2024 | Denoising, Emotional Speech Synthesis | Unverified
Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters | Jan 10, 2024 | Self-Supervised Learning, Speech Enhancement | Unverified
StreamVC: Real-Time Low-Latency Voice Conversion | Jan 5, 2024 | Speech Synthesis, Voice Conversion | Unverified
Incremental FastPitch: Chunk-based High Quality Text to Speech | Jan 3, 2024 | Speech Synthesis, text-to-speech | Unverified
Boosting Large Language Model for Speech Synthesis: An Empirical Study | Dec 30, 2023 | Language Modeling, Language Modelling | Unverified
Normalization of Lithuanian Text Using Regular Expressions | Dec 29, 2023 | Speech Synthesis, Text Normalization | Unverified
Creating New Voices using Normalizing Flows | Dec 22, 2023 | Speech Synthesis, text-to-speech | Unverified
BrainTalker: Low-Resource Brain-to-Speech Synthesis with Transfer Learning using Wav2Vec 2.0 | Dec 21, 2023 | Speech Synthesis, Transfer Learning | Unverified
StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis | Dec 19, 2023 | Decoder, Speech Synthesis | Unverified
Evaluating Speech-in-Speech Perception via a Humanoid Robot | Dec 19, 2023 | Speech Synthesis | Unverified
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis | Dec 17, 2023 | Speech Synthesis, Style Transfer | Unverified
CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis | Dec 16, 2023 | Contrastive Learning, Self-Supervised Learning | Unverified
Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks | Dec 10, 2023 | Representation Learning, Speech Synthesis | Unverified
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis | Dec 8, 2023 | Benchmarking, Quantization | Unverified
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis | Dec 6, 2023 | Speech Synthesis, text-to-speech | Unverified
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints | Dec 2, 2023 | Speech Synthesis, text-to-speech | Unverified
Guided Flows for Generative Modeling and Decision Making | Nov 22, 2023 | Conditional Image Generation, Decision Making | Unverified
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis | Nov 20, 2023 | Speech Synthesis | Unverified
LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement | Nov 17, 2023 | Ensemble Learning, Language Modelling | Unverified
ChatGPT in the context of precision agriculture data analytics | Nov 10, 2023 | Language Modelling, speech-recognition | Code Available
On the Opportunities of Green Computing: A Survey | Nov 1, 2023 | Fairness, Speech Synthesis | Unverified
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning | Oct 26, 2023 | Contrastive Learning, Expressive Speech Synthesis | Unverified
Controllable Generation of Artificial Speaker Embeddings through Discovery of Principal Directions | Oct 26, 2023 | Speech Synthesis | Unverified
Generative Pre-training for Speech with Flow Matching | Oct 25, 2023 | Speech Enhancement, Speech Synthesis | Unverified
Energy-Based Models For Speech Synthesis | Oct 19, 2023 | Speech Synthesis | Unverified
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations | Oct 14, 2023 | Self-Supervised Learning, Speaker Verification | Unverified
Attentive Multi-Layer Perceptron for Non-autoregressive Generation | Oct 14, 2023 | Machine Translation, Speech Synthesis | Code Available
Speaking rate attention-based duration prediction for speed control TTS | Oct 13, 2023 | Attribute, Speech Synthesis | Unverified
Privacy-oriented manipulation of speaker representations | Oct 10, 2023 | Speaker Recognition, Speech Synthesis | Unverified
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting | Oct 8, 2023 | Prediction, Speech Synthesis | Code Available
Neutral TTS Female Voice Corpus in Brazilian Portuguese | Oct 8, 2023 | Speech Synthesis, text-to-speech | Unverified
Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis | Oct 5, 2023 | Data Augmentation, Speech Synthesis | Unverified
The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains | Oct 4, 2023 | Speech Synthesis, text-to-speech | Unverified
High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models | Sep 27, 2023 | All, Speech Synthesis | Unverified
Collaborative Watermarking for Adversarial Speech Synthesis | Sep 26, 2023 | Speaker Verification, Speech Synthesis | Unverified
Face-StyleSpeech: Enhancing Zero-shot Speech Synthesis from Face Images with Improved Face-to-Speech Mapping | Sep 25, 2023 | Speech Synthesis, text-to-speech | Unverified
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis | Sep 22, 2023 | Denoising, Speech Synthesis | Unverified
A Discourse-level Multi-scale Prosodic Model for Fine-grained Emotion Analysis | Sep 21, 2023 | Emotion Recognition, Speech Synthesis | Unverified
Speak While You Think: Streaming Speech Synthesis During Text Generation | Sep 20, 2023 | Speech Synthesis, Text Generation | Unverified
Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition | Sep 19, 2023 | Data Augmentation, Emotion Recognition | Unverified
Exploring Speech Enhancement for Low-resource Speech Synthesis | Sep 19, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Speech Synthesis By Unrolling Diffusion Process using Neural Network Layers | Sep 18, 2023 | Denoising, Speech Synthesis | Unverified