Voice Cloning

Voice cloning is a highly desired feature for personalized speech interfaces. A neural voice cloning system learns to synthesize a person’s voice from only a few audio samples.

Papers

Showing 1–50 of 112 papers

Title	Date	Tasks	Status	Hype
Pronunciation Deviation Analysis Through Voice Cloning and Acoustic Comparison	Jul 15, 2025	Voice Cloning	Unverified	0
De-AntiFake: Rethinking the Protective Perturbations Against Voice Cloning Attacks	Jul 3, 2025	Voice Cloning	Unverified	0
Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes	May 29, 2025	Audio Deepfake Detection, DeepFake Detection	Code Available	0
Voice Adaptation for Swiss German	May 28, 2025	Voice Cloning	Unverified	0
Phir Hera Fairy: An English Fairytaler is a Strong Faker of Fluent Speech in Low-Resource Indian Languages	May 27, 2025	Synthetic Data Generation, Voice Cloning	Unverified	0
VoiceMark: Zero-Shot Voice Cloning-Resistant Watermarking Approach Leveraging Speaker-Specific Latents	May 27, 2025	Voice Cloning	Unverified	0
CloneShield: A Framework for Universal Perturbation Against Zero-Shot Voice Cloning	May 25, 2025	text-to-speech, Text to Speech	Unverified	0
Beyond Face Swapping: A Diffusion-Based Digital Human Benchmark for Multimodal Deepfake Detection	May 22, 2025	DeepFake Detection, Face Swapping	Unverified	0
MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling	May 21, 2025	Emotion Recognition, Face Detection	Unverified	0
VoiceCloak: A Multi-Dimensional Defense Framework against Unauthorized Diffusion-based Voice Cloning	May 18, 2025	Representation Learning, Voice Cloning	Unverified	0
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder	May 12, 2025	text-to-speech, Text to Speech	Unverified	0
Voice Cloning: Comprehensive Survey	May 1, 2025	Survey, Voice Cloning	Unverified	0
ClonEval: An Open Voice Cloning Benchmark	Apr 29, 2025	text-to-speech, Text to Speech	Code Available	0
"It's not a representation of me": Examining Accent Bias and Digital Exclusion in Synthetic AI Voice Services	Apr 12, 2025	Voice Cloning	Unverified	0
Empowering Global Voices: A Data-Efficient, Phoneme-Tone Adaptive Approach to High-Fidelity Speech Synthesis	Apr 10, 2025	Speech Synthesis, text-to-speech	Unverified	0
SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development	Mar 31, 2025	Speech Synthesis, Voice Cloning	Code Available	0
SoK: How Robust is Audio Watermarking in Generative AI models?	Mar 24, 2025	Voice Cloning	Unverified	0
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens	Mar 3, 2025	Attribute, text-to-speech	Code Available	11
Voice Cloning for Dysarthric Speech Synthesis: Addressing Data Scarcity in Speech-Language Pathology	Mar 3, 2025	Speech Synthesis, Voice Cloning	Unverified	0
Steganography Beyond Space-Time with Chain of Multimodal AI	Feb 25, 2025	Face Swapping, Text Generation	Unverified	0
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation	Feb 18, 2025	Voice Cloning	Code Available	3
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction	Feb 17, 2025	Instruction Following, Voice Cloning	Code Available	7
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System	Feb 8, 2025	Decoder, Language Modeling	Code Available	11
Deepfake Technology Unveiled: The Commoditization of AI and Its Impact on Digital Trust	Jan 24, 2025	Face Swapping, Misinformation	Unverified	0
Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement	Jan 15, 2025	Computational Efficiency, CPU	Unverified	0
MARS6: A Small and Robust Hierarchical-Codec Text-to-Speech Model	Jan 10, 2025	Decoder, Language Modelling	Unverified	0
Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM Dataset	Dec 25, 2024	text-to-speech, Text to Speech	Unverified	0
Speech Watermarking with Discrete Intermediate Representations	Dec 18, 2024	Voice Cloning	Unverified	0
Parallel Stacked Aggregated Network for Voice Authentication in IoT-Enabled Smart Devices	Nov 29, 2024	Voice Anti-spoofing, Voice Cloning	Unverified	0
Hindi audio-video-Deepfake (HAV-DF): A Hindi language-based Audio-video Deepfake Dataset	Nov 23, 2024	DeepFake Detection, Face Swapping	Unverified	0
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings	Oct 31, 2024	Voice Cloning	Unverified	0
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis	Oct 30, 2024	Speech Synthesis, text-to-speech	Code Available	2
DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis	Oct 14, 2024	Denoising, Speaker Verification	Unverified	0
Can DeepFake Speech be Reliably Detected?	Oct 9, 2024	Face Swapping, Misinformation	Unverified	0
Algorithms For Automatic Accentuation And Transcription Of Russian Texts In Speech Recognition Systems	Oct 3, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control	Oct 1, 2024	Emotional Speech Synthesis, Speech Synthesis	Code Available	2
Augmentation through Laundering Attacks for Audio Spoof Detection	Oct 1, 2024	Data Augmentation, Face Swapping	Unverified	0
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation	Sep 23, 2024	Language Modeling, Language Modelling	Code Available	1
Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space	Sep 19, 2024	Automatic Speech Recognition, Data Augmentation	Unverified	0
Multi-modal Adversarial Training for Zero-Shot Voice Cloning	Aug 28, 2024	Decoder, text-to-speech	Unverified	0
Is Audio Spoof Detection Robust to Laundering Attacks?	Aug 27, 2024	Voice Cloning	Code Available	0
kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech	Aug 20, 2024	Retrieval, Self-Supervised Learning	Unverified	0
Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language	Aug 19, 2024	Transfer Learning, Voice Cloning	Unverified	0
WavLM model ensemble for audio deepfake detection	Aug 14, 2024	Audio Deepfake Detection, Data Augmentation	Code Available	0
Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems	Jul 18, 2024	Speech-to-Speech Translation, Voice Cloning	Unverified	0
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens	Jul 7, 2024	Language Modelling, Large Language Model	Code Available	11
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs	Jul 4, 2024	Emotion Recognition, Event Detection	Code Available	11
A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge	Jun 22, 2024	Speech Synthesis, text-to-speech	Unverified	0
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing	Jun 13, 2024	Language Modeling, Language Modelling	Unverified	0
Spoken Language Corpora Augmentation with Domain-Specific Voice-Cloned Speech	Jun 11, 2024	speech-recognition, Speech Recognition	Unverified	0

No leaderboard results yet.