RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding | Jun 12, 2025 | CPU, Voice Conversion | Unverified 0
Training-Free Voice Conversion with Factorized Optimal Transport | Jun 11, 2025 | Voice Conversion | Code Available 1
CO-VADA: A Confidence-Oriented Voice Augmentation Debiasing Approach for Fair Speech Emotion Recognition | Jun 6, 2025 | Emotion Recognition, Fairness | Unverified 0
Towards Better Disentanglement in Non-Autoregressive Zero-Shot Expressive Voice Conversion | Jun 4, 2025 | Disentanglement, Style Transfer | Unverified 0
StarVC: A Unified Auto-Regressive Framework for Joint Text and Speech Generation in Voice Conversion | Jun 3, 2025 | Voice Conversion | Unverified 0
LinearVC: Linear transformations of self-supervised features through the lens of voice conversion | Jun 2, 2025 | Voice Conversion | Unverified 0
Unsupervised Rhythm and Voice Conversion to Improve ASR on Dysarthric Speech | Jun 2, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available 0
SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction | Jun 2, 2025 | Speech Synthesis, text-to-speech | Unverified 0
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching | Jun 1, 2025 | Rhythm, Style Transfer | Unverified 0
PseudoVC: Improving One-shot Voice Conversion with Pseudo Paired Data | Jun 1, 2025 | Voice Conversion | Unverified 0
Voice Conversion Improves Cross-Domain Robustness for Spoken Arabic Dialect Identification | May 30, 2025 | Dialect Identification, Voice Conversion | Unverified 0
Discl-VC: Disentangled Discrete Tokens and In-Context Learning for Controllable Zero-Shot Voice Conversion | May 30, 2025 | In-Context Learning, Voice Conversion | Unverified 0
A Perception-Based L2 Speech Intelligibility Indicator: Leveraging a Rater's Shadowing and Sequence-to-sequence Voice Conversion | May 30, 2025 | Voice Conversion | Unverified 0
When Humans Growl and Birds Speak: High-Fidelity Voice Conversion from Human to Animal and Designed Sounds | May 30, 2025 | Voice Conversion | Unverified 0
REWIND: Speech Time Reversal for Enhancing Speaker Representations in Diffusion-based Voice Conversion | May 27, 2025 | Disentanglement, Speaker Identification | Unverified 0
PromptEVC: Controllable Emotional Voice Conversion with Natural Language Prompts | May 27, 2025 | Diversity, Rhythm | Unverified 0
VibE-SVC: Vibrato Extraction with High-frequency F0 Contour for Singing Voice Conversion | May 27, 2025 | Voice Conversion | Unverified 0
ArVoice: A Multi-Speaker Dataset for Arabic Speech Synthesis | May 26, 2025 | DeepFake Detection, Face Swapping | Unverified 0
Eta-WavLM: Efficient Speaker Identity Removal in Self-Supervised Speech Representations Using a Simple Linear Equation | May 25, 2025 | Disentanglement, Self-Supervised Learning | Unverified 0
Private kNN-VC: Interpretable Anonymization of Converted Speech | May 23, 2025 | Speaker anonymization, Speaker Recognition | Code Available 0
EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion | May 22, 2025 | Decoder, Voice Conversion | Unverified 0
Towards Inclusive ASR: Investigating Voice Conversion for Dysarthric Speech Recognition in Low-Resource Languages | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available 0
ClapFM-EVC: High-Fidelity and Flexible Emotional Voice Conversion with Dual Control from Natural Language and Speech | May 20, 2025 | Voice Conversion | Unverified 0
Investigating self-supervised features for expressive, multilingual voice conversion | May 13, 2025 | Self-Supervised Learning, Speech Synthesis | Unverified 0
Discrete Optimal Transport and Voice Conversion | May 7, 2025 | Audio Generation, Voice Conversion | Unverified 0
Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements | Apr 27, 2025 | Generative Adversarial Network, Speech Synthesis | Unverified 0
FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning | Apr 22, 2025 | Deep Learning, Speaker Verification | Unverified 0
Collective Learning Mechanism based Optimal Transport Generative Adversarial Network for Non-parallel Voice Conversion | Apr 18, 2025 | Generative Adversarial Network, Image Generation | Unverified 0
Voice Conversion with Diverse Intonation using Conditional Variational Auto-Encoder | Apr 16, 2025 | Diversity, Voice Conversion | Unverified 0
Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion | Apr 11, 2025 | Voice Conversion | Unverified 0
kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness Optimization | Apr 8, 2025 | Voice Conversion | Code Available 1
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR | Mar 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified 0
Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge | Feb 18, 2025 | Voice Conversion | Code Available 0
ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech | Feb 13, 2025 | Adversarial Attack, Adversarial Attack Detection | Unverified 0
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement | Feb 11, 2025 | Disentanglement, text-to-speech | Unverified 0
Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features | Feb 7, 2025 | Melody Extraction, Self-Supervised Learning | Unverified 0
FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks | Feb 6, 2025 | Resynthesis, Voice Conversion | Unverified 0
GenVC: Self-Supervised Zero-Shot Voice Conversion | Feb 6, 2025 | Voice Conversion | Unverified 0
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training | Feb 5, 2025 | Self-Supervised Learning, Speech Enhancement | Code Available 9
VoicePrompter: Robust Zero-Shot Voice Conversion with Voice Prompt and Conditional Flow Matching | Jan 29, 2025 | Decoder, In-Context Learning | Unverified 0
Overview of the Amphion Toolkit (v0.2) | Jan 26, 2025 | text-to-speech, Text to Speech | Code Available 9
Stepback: Enhanced Disentanglement for Voice Conversion via Multi-Task Learning | Jan 26, 2025 | Disentanglement, Multi-Task Learning | Unverified 0
Generalizable Audio Deepfake Detection via Latent Space Refinement and Augmentation | Jan 24, 2025 | Audio Deepfake Detection, DeepFake Detection | Unverified 0
Unsupervised Rhythm and Voice Conversion of Dysarthric to Healthy Speech for ASR | Jan 17, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified 0
Speech Synthesis along Perceptual Voice Quality Dimensions | Jan 15, 2025 | Expressive Speech Synthesis, Speech Synthesis | Unverified 0
Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives | Jan 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified 0
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training | Jan 8, 2025 | In-Context Learning, Voice Conversion | Unverified 0
Generating and Detecting Various Types of Fake Image and Audio Content: A Review of Modern Deep Learning Technologies and Tools | Jan 7, 2025 | Face Swapping, Voice Conversion | Unverified 0
AdaptVC: High Quality Voice Conversion with Adaptive Learning | Jan 2, 2025 | Decoder, Disentanglement | Unverified 0
EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion | Dec 29, 2024 | Self-Supervised Learning, Voice Conversion | Unverified 0