SOTAVerified|Agents Browse Leaderboard About Blog

Voice Cloning

Voice cloning is a highly desired feature for personalized speech interfaces. Neural voice cloning system learns to synthesize a person’s voice from only a few audio samples.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 26–50 of 112 papers

Title	Date	Tasks	Status	Score
Is Audio Spoof Detection Robust to Laundering Attacks?	Aug 27, 2024	Voice Cloning	CodeCode Available	5
WavLM model ensemble for audio deepfake detection	Aug 14, 2024	Audio Deepfake DetectionData Augmentation	CodeCode Available	5
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis	Jun 12, 2018	Speaker VerificationSpeech Synthesis	CodeCode Available	5
SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development	Mar 31, 2025	Speech SynthesisVoice Cloning	CodeCode Available	5
SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines	Nov 6, 2021	DisentanglementSpeaker Verification	CodeCode Available	5
Discovery of Single Independent Latent Variable	Oct 12, 2021	Image GenerationVoice Cloning	CodeCode Available	5
ClonEval: An Open Voice Cloning Benchmark	Apr 29, 2025	text-to-speechText to Speech	CodeCode Available	5
Neural Voice Cloning with a Few Samples	Feb 14, 2018	Speech SynthesisVoice Cloning	CodeCode Available	5
PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset	May 14, 2024	DeepFake DetectionFace Swapping	CodeCode Available	5
Empowering Global Voices: A Data-Efficient, Phoneme-Tone Adaptive Approach to High-Fidelity Speech Synthesis	Apr 10, 2025	Speech Synthesistext-to-speech	—Unverified	0
Can DeepFake Speech be Reliably Detected?	Oct 9, 2024	Face SwappingMisinformation	—Unverified	0
Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language	Aug 19, 2024	Transfer LearningVoice Cloning	—Unverified	0
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing	Jun 13, 2024	Language ModelingLanguage Modelling	—Unverified	0
DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis	Oct 14, 2024	DenoisingSpeaker Verification	—Unverified	0
Beyond Face Swapping: A Diffusion-Based Digital Human Benchmark for Multimodal Deepfake Detection	May 22, 2025	DeepFake DetectionFace Swapping	—Unverified	0
Latent linguistic embedding for cross-lingual text-to-speech and voice conversion	Oct 8, 2020	text-to-speechText to Speech	—Unverified	0
Deepfake Technology Unveiled: The Commoditization of AI and Its Impact on Digital Trust	Jan 24, 2025	Face SwappingMisinformation	—Unverified	0
Augmentation through Laundering Attacks for Audio Spoof Detection	Oct 1, 2024	Data AugmentationFace Swapping	—Unverified	0
Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM Dataset	Dec 25, 2024	text-to-speechText to Speech	—Unverified	0
Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices	Jun 11, 2024	EthicsFairness	—Unverified	0
"It's not a representation of me": Examining Accent Bias and Digital Exclusion in Synthetic AI Voice Services	Apr 12, 2025	Voice Cloning	—Unverified	0
De-AntiFake: Rethinking the Protective Perturbations Against Voice Cloning Attacks	Jul 3, 2025	Voice Cloning	—Unverified	0
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech	Mar 6, 2021	text-to-speechText to Speech	—Unverified	0
Data Efficient Voice Cloning for Neural Singing Synthesis	Feb 19, 2019	text-to-speechText to Speech	—Unverified	0
Improve few-shot voice cloning using multi-modal learning	Mar 18, 2022	text-to-speechText to Speech	—Unverified	0

Show:10 25 50

← PrevPage 2 of 5Next →

No leaderboard results yet.