Voice Cloning

Voice cloning is a highly desired feature for personalized speech interfaces. A neural voice cloning system learns to synthesize a person’s voice from only a few audio samples.

Papers

Showing 51–100 of 112 papers

Title	Date	Tasks	Status	Hype
Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices	Jun 11, 2024	Ethics, Fairness	Unverified	0
XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model	Jun 7, 2024	text-to-speech, Text to Speech	Code Available	1
Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis	Jun 6, 2024	Decoder, Inductive Bias	Code Available	2
Non-autoregressive real-time Accent Conversion model with voice cloning	May 21, 2024	Speech Enhancement, speech-recognition	Unverified	0
PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset	May 14, 2024	DeepFake Detection, Face Swapping	Code Available	0
StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing	Feb 20, 2024	Voice Cloning	Code Available	2
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech	Feb 14, 2024	Decoder, GPU	Unverified	0
Proactive Detection of Voice Cloning with Localized Watermarking	Jan 30, 2024	Voice Cloning	Code Available	4
Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages	Jan 24, 2024	Voice Cloning	Unverified	0
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis	Jan 22, 2024	Speaker Verification, Speech Synthesis	Unverified	0
OpenVoice: Versatile Instant Voice Cloning	Dec 3, 2023	Rhythm, Voice Cloning	Code Available	7
MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer's Care Via Unleashing Generative AI	Nov 20, 2023	Chatbot, Prompt Engineering	Unverified	0
Learning Through AI-Clones: Enhancing Self-Perception and Presentation Performance	Oct 23, 2023	Face Swapping, Voice Cloning	Unverified	0
High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models	Sep 27, 2023	All, Speech Synthesis	Unverified	0
Collaborative Watermarking for Adversarial Speech Synthesis	Sep 26, 2023	Speaker Verification, Speech Synthesis	Unverified	0
TRAVID: An End-to-End Video Translation Framework	Sep 20, 2023	Translation, Voice Cloning	Unverified	0
Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion	Aug 24, 2023	Audio Classification, Binary Classification	Unverified	0
Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques	Aug 5, 2023	Quantization, Speaker anonymization	Code Available	1
Single and Multi-Speaker Cloned Voice Detection: From Perceptual to Learned Features	Jul 15, 2023	Voice Cloning	Code Available	1
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis	Jul 14, 2023	In-Context Learning, Language Modelling	Unverified	0
Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert	Apr 18, 2023	Audio Generation, Expressive Speech Synthesis	Code Available	4
ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech	Nov 7, 2022	Representation Learning, Speech Representation Learning	Code Available	6
Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System	Nov 1, 2022	Face Generation, Speech Synthesis	Unverified	0
Low-Resource Multilingual and Zero-Shot Multispeaker TTS	Oct 21, 2022	Meta-Learning, text-to-speech	Unverified	0
Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis	Oct 14, 2022	Speech Synthesis, Voice Cloning	Code Available	0
Mix and Match: An Empirical Study on Training Corpus Composition for Polyglot Text-To-Speech (TTS)	Jul 4, 2022	Speech Synthesis, text-to-speech	Unverified	0
Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE	Jun 6, 2022	Representation Learning, Speech Representation Learning	Unverified	0
Dictionary Attacks on Speaker Verification	Apr 24, 2022	Speaker Verification, Voice Cloning	Code Available	0
Self-supervised learning for robust voice cloning	Apr 7, 2022	Self-Supervised Learning, Speech Synthesis	Unverified	0
Improve few-shot voice cloning using multi-modal learning	Mar 18, 2022	text-to-speech, Text to Speech	Unverified	0
Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention	Jan 25, 2022	Form, Speech Synthesis	Unverified	0
V2C: Visual Voice Cloning	Nov 25, 2021	Voice Cloning	Unverified	0
Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning	Nov 14, 2021	Disentanglement, Meta-Learning	Unverified	0
SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines	Nov 6, 2021	Disentanglement, Speaker Verification	Code Available	0
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech	Oct 14, 2021	Disentanglement, text-to-speech	Unverified	0
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data	Oct 14, 2021	text-to-speech, Text to Speech	Unverified	0
Revisiting IPA-based Cross-lingual Text-to-speech	Oct 14, 2021	text-to-speech, Text to Speech	Unverified	0
Discovery of Single Independent Latent Variable	Oct 12, 2021	Image Generation, Voice Cloning	Code Available	0
Adapting TTS models For New Speakers using Transfer Learning	Oct 12, 2021	text-to-speech, Text to Speech	Unverified	0
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation	Jul 19, 2021	Data Augmentation, Decoder	Unverified	0
AI based Presentation Creator With Customized Audio Content Delivery	Jun 27, 2021	Generative Adversarial Network, Voice Cloning	Unverified	0
Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text	Jun 26, 2021	Talking Face Generation, Talking Head Generation	Code Available	1
Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance	Jun 25, 2021	Quantization, Speaker anonymization	Unverified	0
Building Bilingual and Code-Switched Voice Conversion with Limited Training Data Using Embedding Consistency Loss	Apr 22, 2021	Voice Cloning, Voice Conversion	Code Available	1
The AS-NU System for the M2VoC Challenge	Apr 7, 2021	Voice Cloning	Unverified	0
The Multi-speaker Multi-style Voice Cloning Challenge 2021	Apr 5, 2021	Benchmarking, Voice Cloning	Unverified	0
CUHK-EE Voice Cloning System for ICASSP 2021 M2VoC Challenge	Mar 8, 2021	Voice Cloning	Unverified	0
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech	Mar 6, 2021	text-to-speech, Text to Speech	Unverified	0
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning	Feb 10, 2021	Speech Synthesis, text-to-speech	Unverified	0
Expressive Neural Voice Cloning	Jan 30, 2021	Speech Synthesis, Style Transfer	Unverified	0
