SOTAVerified|Agents Browse Leaderboard About Blog

Voice Cloning

Voice cloning is a highly desired feature for personalized speech interfaces. Neural voice cloning system learns to synthesize a person’s voice from only a few audio samples.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 26–50 of 112 papers

Title	Date	Tasks	Status	Hype
MARS6: A Small and Robust Hierarchical-Codec Text-to-Speech Model	Jan 10, 2025	DecoderLanguage Modelling	—Unverified	0
Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM Dataset	Dec 25, 2024	text-to-speechText to Speech	—Unverified	0
Speech Watermarking with Discrete Intermediate Representations	Dec 18, 2024	Voice Cloning	—Unverified	0
Parallel Stacked Aggregated Network for Voice Authentication in IoT-Enabled Smart Devices	Nov 29, 2024	Voice Anti-spoofingVoice Cloning	—Unverified	0
Hindi audio-video-Deepfake (HAV-DF): A Hindi language-based Audio-video Deepfake Dataset	Nov 23, 2024	DeepFake DetectionFace Swapping	—Unverified	0
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings	Oct 31, 2024	Voice Cloning	—Unverified	0
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis	Oct 30, 2024	Speech Synthesistext-to-speech	CodeCode Available	2
DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis	Oct 14, 2024	DenoisingSpeaker Verification	—Unverified	0
Can DeepFake Speech be Reliably Detected?	Oct 9, 2024	Face SwappingMisinformation	—Unverified	0
Algorithms For Automatic Accentuation And Transcription Of Russian Texts In Speech Recognition Systems	Oct 3, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control	Oct 1, 2024	Emotional Speech SynthesisSpeech Synthesis	CodeCode Available	2
Augmentation through Laundering Attacks for Audio Spoof Detection	Oct 1, 2024	Data AugmentationFace Swapping	—Unverified	0
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation	Sep 23, 2024	Language ModelingLanguage Modelling	CodeCode Available	1
Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space	Sep 19, 2024	Automatic Speech RecognitionData Augmentation	—Unverified	0
Multi-modal Adversarial Training for Zero-Shot Voice Cloning	Aug 28, 2024	Decodertext-to-speech	—Unverified	0
Is Audio Spoof Detection Robust to Laundering Attacks?	Aug 27, 2024	Voice Cloning	CodeCode Available	0
kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech	Aug 20, 2024	RetrievalSelf-Supervised Learning	—Unverified	0
Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language	Aug 19, 2024	Transfer LearningVoice Cloning	—Unverified	0
WavLM model ensemble for audio deepfake detection	Aug 14, 2024	Audio Deepfake DetectionData Augmentation	CodeCode Available	0
Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems	Jul 18, 2024	Speech-to-Speech TranslationVoice Cloning	—Unverified	0
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens	Jul 7, 2024	Language ModellingLarge Language Model	CodeCode Available	11
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs	Jul 4, 2024	Emotion RecognitionEvent Detection	CodeCode Available	11
A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge	Jun 22, 2024	Speech Synthesistext-to-speech	—Unverified	0
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing	Jun 13, 2024	Language ModelingLanguage Modelling	—Unverified	0
Spoken Language Corpora Augmentation with Domain-Specific Voice-Cloned Speech	Jun 11, 2024	speech-recognitionSpeech Recognition	—Unverified	0

Show:10 25 50

← PrevPage 2 of 5Next →

No leaderboard results yet.