Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis Jul 4, 2024 Accented Speech Recognition Automatic Speech Recognition
Probing the Feasibility of Multilingual Speaker Anonymization Jul 3, 2024 Speaker anonymization Speech Synthesis
Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization Jul 2, 2024 Inference Optimization Speech Synthesis
A Comprehensive Survey on Diffusion Models and Their Applications Jul 1, 2024 Speech Synthesis Survey
Lightweight Zero-shot Text-to-Speech with Mixture of Adapters Jul 1, 2024 Decoder Speech Synthesis
FLY-TTS: Fast, Lightweight and High-Quality End-to-End Text-to-Speech Synthesis Jun 30, 2024 CPU Decoder
Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation Jun 25, 2024 Speech Synthesis text-to-speech
High Fidelity Text-to-Speech Via Discrete Tokens Using Token Transducer and Group Masked Language Model Jun 25, 2024 Computational Efficiency Language Modeling
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment Jun 25, 2024 Decoder Language Modeling
Towards Zero-Shot Text-To-Speech for Arabic Dialects Jun 24, 2024 Dialect Identification Speech Synthesis
One-Class Learning with Adaptive Centroid Shift for Audio Deepfake Detection Jun 24, 2024 Audio Deepfake Detection DeepFake Detection
A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge Jun 22, 2024 Speech Synthesis text-to-speech
A Mel Spectrogram Enhancement Paradigm Based on CWT in Speech Synthesis Jun 18, 2024 Decoder Speech Synthesis
1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis Jun 17, 2024 Diversity Speech Synthesis
Multi-Scale Accent Modeling and Disentangling for Multi-Speaker Multi-Accent Text-to-Speech Synthesis Jun 16, 2024 Disentanglement Speech Synthesis
ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis Jun 13, 2024 Quantization Speech Synthesis
PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models Jun 12, 2024 Language Modeling Language Modelling
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment Jun 12, 2024 Quantization Speech Synthesis
CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems Jun 11, 2024 Audio Synthesis Face Swapping
Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? Jun 11, 2024 Contrastive Learning Speech Synthesis
Meta Learning Text-to-Speech Synthesis in over 7000 Languages Jun 10, 2024 Meta-Learning Speech Synthesis
JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis Jun 10, 2024 Speech Synthesis
Text-aware and Context-aware Expressive Audiobook Speech Synthesis Jun 9, 2024 Contrastive Learning Language Modeling
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers Jun 8, 2024 Speech Synthesis text-to-speech
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis Jun 8, 2024 Audio Generation Decoder
Spectral Codecs: Improving Non-Autoregressive Speech Synthesis with Spectrogram-Based Audio Codecs Jun 7, 2024 Quantization Speech Synthesis
Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model Jun 6, 2024 Language Modeling Language Modelling
Style Mixture of Experts for Expressive Text-To-Speech Synthesis Jun 5, 2024 Mixture-of-Experts Speech Synthesis
Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis Jun 4, 2024 In-Context Learning Language Modeling
Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training Jun 3, 2024 Speech Synthesis text-to-speech
Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback Jun 2, 2024 Speech Synthesis text-to-speech
Multilingual Prosody Transfer: Comparing Supervised & Transfer Learning May 23, 2024 Speech Synthesis text-to-speech
DLPO: Diffusion Model Loss-Guided Reinforcement Learning for Fine-Tuning Text-to-Speech Diffusion Models May 23, 2024 Image Generation reinforcement-learning
Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model May 16, 2024 Hallucination Language Modeling
Expressivity and Speech Synthesis Apr 30, 2024 Expressive Speech Synthesis Speech Synthesis
Retrieval-Augmented Audio Deepfake Detection Apr 22, 2024 Audio Deepfake Detection DeepFake Detection
Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications Apr 21, 2024 Computational Efficiency Model Optimization
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis Apr 4, 2024 Language Modeling Language Modelling
PSCodec: A Series of High-Fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders Apr 3, 2024 Representation Learning Speaker Verification
Leveraging the Interplay Between Syntactic and Acoustic Cues for Optimizing Korean TTS Pause Formation Apr 3, 2024 Speech Synthesis
Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling Apr 1, 2024 Speaker Identification Speech Synthesis
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation Mar 31, 2024 Language Modeling Language Modelling
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator Mar 25, 2024 Data Augmentation Generative Adversarial Network
M^3AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset Mar 21, 2024 Diversity Script Generation
An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis Mar 19, 2024 In-Context Learning Speech Synthesis
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech Mar 13, 2024 GPU Speech Synthesis
Towards Accurate Lip-to-Speech Synthesis in-the-Wild Mar 2, 2024 Language Modelling Lip to Speech Synthesis
VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis Mar 1, 2024 Speech Synthesis
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data Feb 29, 2024 Representation Learning Speech Synthesis
Towards Decoding Brain Activity During Passive Listening of Speech Feb 26, 2024 Brain Computer Interface Speech Synthesis