Pronunciation Deviation Analysis Through Voice Cloning and Acoustic Comparison | Jul 15, 2025 | Voice Cloning | Unverified | 0
De-AntiFake: Rethinking the Protective Perturbations Against Voice Cloning Attacks | Jul 3, 2025 | Voice Cloning | Unverified | 0
Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes | May 29, 2025 | Audio Deepfake Detection, DeepFake Detection | Code Available | 0
Voice Adaptation for Swiss German | May 28, 2025 | Voice Cloning | Unverified | 0
Phir Hera Fairy: An English Fairytaler is a Strong Faker of Fluent Speech in Low-Resource Indian Languages | May 27, 2025 | Synthetic Data Generation, Voice Cloning | Unverified | 0
VoiceMark: Zero-Shot Voice Cloning-Resistant Watermarking Approach Leveraging Speaker-Specific Latents | May 27, 2025 | Voice Cloning | Unverified | 0
CloneShield: A Framework for Universal Perturbation Against Zero-Shot Voice Cloning | May 25, 2025 | text-to-speech, Text to Speech | Unverified | 0
Beyond Face Swapping: A Diffusion-Based Digital Human Benchmark for Multimodal Deepfake Detection | May 22, 2025 | DeepFake Detection, Face Swapping | Unverified | 0
MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling | May 21, 2025 | Emotion Recognition, Face Detection | Unverified | 0
VoiceCloak: A Multi-Dimensional Defense Framework against Unauthorized Diffusion-based Voice Cloning | May 18, 2025 | Representation Learning, Voice Cloning | Unverified | 0
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder | May 12, 2025 | text-to-speech, Text to Speech | Unverified | 0
Voice Cloning: Comprehensive Survey | May 1, 2025 | Survey, Voice Cloning | Unverified | 0
ClonEval: An Open Voice Cloning Benchmark | Apr 29, 2025 | text-to-speech, Text to Speech | Code Available | 0
"It's not a representation of me": Examining Accent Bias and Digital Exclusion in Synthetic AI Voice Services | Apr 12, 2025 | Voice Cloning | Unverified | 0
Empowering Global Voices: A Data-Efficient, Phoneme-Tone Adaptive Approach to High-Fidelity Speech Synthesis | Apr 10, 2025 | Speech Synthesis, text-to-speech | Unverified | 0
SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development | Mar 31, 2025 | Speech Synthesis, Voice Cloning | Code Available | 0
SoK: How Robust is Audio Watermarking in Generative AI models? | Mar 24, 2025 | Voice Cloning | Unverified | 0
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens | Mar 3, 2025 | Attribute, text-to-speech | Code Available | 11
Voice Cloning for Dysarthric Speech Synthesis: Addressing Data Scarcity in Speech-Language Pathology | Mar 3, 2025 | Speech Synthesis, Voice Cloning | Unverified | 0
Steganography Beyond Space-Time with Chain of Multimodal AI | Feb 25, 2025 | Face Swapping, Text Generation | Unverified | 0
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation | Feb 18, 2025 | Voice Cloning | Code Available | 3
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction | Feb 17, 2025 | Instruction Following, Voice Cloning | Code Available | 7
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | Feb 8, 2025 | Decoder, Language Modeling | Code Available | 11
Deepfake Technology Unveiled: The Commoditization of AI and Its Impact on Digital Trust | Jan 24, 2025 | Face Swapping, Misinformation | Unverified | 0
Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement | Jan 15, 2025 | Computational Efficiency, CPU | Unverified | 0
MARS6: A Small and Robust Hierarchical-Codec Text-to-Speech Model | Jan 10, 2025 | Decoder, Language Modelling | Unverified | 0
Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM Dataset | Dec 25, 2024 | text-to-speech, Text to Speech | Unverified | 0
Speech Watermarking with Discrete Intermediate Representations | Dec 18, 2024 | Voice Cloning | Unverified | 0
Parallel Stacked Aggregated Network for Voice Authentication in IoT-Enabled Smart Devices | Nov 29, 2024 | Voice Anti-spoofing, Voice Cloning | Unverified | 0
Hindi audio-video-Deepfake (HAV-DF): A Hindi language-based Audio-video Deepfake Dataset | Nov 23, 2024 | DeepFake Detection, Face Swapping | Unverified | 0
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings | Oct 31, 2024 | Voice Cloning | Unverified | 0
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis | Oct 30, 2024 | Speech Synthesis, text-to-speech | Code Available | 2
DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis | Oct 14, 2024 | Denoising, Speaker Verification | Unverified | 0
Can DeepFake Speech be Reliably Detected? | Oct 9, 2024 | Face Swapping, Misinformation | Unverified | 0
Algorithms For Automatic Accentuation And Transcription Of Russian Texts In Speech Recognition Systems | Oct 3, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control | Oct 1, 2024 | Emotional Speech Synthesis, Speech Synthesis | Code Available | 2
Augmentation through Laundering Attacks for Audio Spoof Detection | Oct 1, 2024 | Data Augmentation, Face Swapping | Unverified | 0
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation | Sep 23, 2024 | Language Modeling, Language Modelling | Code Available | 1
Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space | Sep 19, 2024 | Automatic Speech Recognition, Data Augmentation | Unverified | 0
Multi-modal Adversarial Training for Zero-Shot Voice Cloning | Aug 28, 2024 | Decoder, text-to-speech | Unverified | 0
Is Audio Spoof Detection Robust to Laundering Attacks? | Aug 27, 2024 | Voice Cloning | Code Available | 0
kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech | Aug 20, 2024 | Retrieval, Self-Supervised Learning | Unverified | 0
Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language | Aug 19, 2024 | Transfer Learning, Voice Cloning | Unverified | 0
WavLM model ensemble for audio deepfake detection | Aug 14, 2024 | Audio Deepfake Detection, Data Augmentation | Code Available | 0
Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems | Jul 18, 2024 | Speech-to-Speech Translation, Voice Cloning | Unverified | 0
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens | Jul 7, 2024 | Language Modelling, Large Language Model | Code Available | 11
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs | Jul 4, 2024 | Emotion Recognition, Event Detection | Code Available | 11
A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge | Jun 22, 2024 | Speech Synthesis, text-to-speech | Unverified | 0
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing | Jun 13, 2024 | Language Modeling, Language Modelling | Unverified | 0
Spoken Language Corpora Augmentation with Domain-Specific Voice-Cloned Speech | Jun 11, 2024 | speech-recognition, Speech Recognition | Unverified | 0