Voice Cloning

Voice cloning is a highly desired feature for personalized speech interfaces. A neural voice cloning system learns to synthesize a person’s voice from only a few audio samples.

Papers

Showing 51–100 of 112 papers

Title	Date	Tasks	Status
Can DeepFake Speech be Reliably Detected?	Oct 9, 2024	Face Swapping, Misinformation	Unverified
Algorithms For Automatic Accentuation And Transcription Of Russian Texts In Speech Recognition Systems	Oct 3, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Augmentation through Laundering Attacks for Audio Spoof Detection	Oct 1, 2024	Data Augmentation, Face Swapping	Unverified
Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space	Sep 19, 2024	Automatic Speech Recognition, Data Augmentation	Unverified
Multi-modal Adversarial Training for Zero-Shot Voice Cloning	Aug 28, 2024	Decoder, text-to-speech	Unverified
Is Audio Spoof Detection Robust to Laundering Attacks?	Aug 27, 2024	Voice Cloning	Code Available
kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech	Aug 20, 2024	Retrieval, Self-Supervised Learning	Unverified
Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language	Aug 19, 2024	Transfer Learning, Voice Cloning	Unverified
WavLM model ensemble for audio deepfake detection	Aug 14, 2024	Audio Deepfake Detection, Data Augmentation	Code Available
Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems	Jul 18, 2024	Speech-to-Speech Translation, Voice Cloning	Unverified
A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge	Jun 22, 2024	Speech Synthesis, text-to-speech	Unverified
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing	Jun 13, 2024	Language Modeling, Language Modelling	Unverified
Spoken Language Corpora Augmentation with Domain-Specific Voice-Cloned Speech	Jun 11, 2024	speech-recognition, Speech Recognition	Unverified
Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices	Jun 11, 2024	Ethics, Fairness	Unverified
Non-autoregressive real-time Accent Conversion model with voice cloning	May 21, 2024	Speech Enhancement, speech-recognition	Unverified
PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset	May 14, 2024	DeepFake Detection, Face Swapping	Code Available
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech	Feb 14, 2024	Decoder, GPU	Unverified
Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages	Jan 24, 2024	Voice Cloning	Unverified
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis	Jan 22, 2024	Speaker Verification, Speech Synthesis	Unverified
MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer's Care Via Unleashing Generative AI	Nov 20, 2023	Chatbot, Prompt Engineering	Unverified
Learning Through AI-Clones: Enhancing Self-Perception and Presentation Performance	Oct 23, 2023	Face Swapping, Voice Cloning	Unverified
High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models	Sep 27, 2023	All, Speech Synthesis	Unverified
Collaborative Watermarking for Adversarial Speech Synthesis	Sep 26, 2023	Speaker Verification, Speech Synthesis	Unverified
TRAVID: An End-to-End Video Translation Framework	Sep 20, 2023	Translation, Voice Cloning	Unverified
Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion	Aug 24, 2023	Audio Classification, Binary Classification	Unverified
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis	Jul 14, 2023	In-Context Learning, Language Modelling	Unverified
Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System	Nov 1, 2022	Face Generation, Speech Synthesis	Unverified
Low-Resource Multilingual and Zero-Shot Multispeaker TTS	Oct 21, 2022	Meta-Learning, text-to-speech	Unverified
Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis	Oct 14, 2022	Speech Synthesis, Voice Cloning	Code Available
Mix and Match: An Empirical Study on Training Corpus Composition for Polyglot Text-To-Speech (TTS)	Jul 4, 2022	Speech Synthesis, text-to-speech	Unverified
Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE	Jun 6, 2022	Representation Learning, Speech Representation Learning	Unverified
Dictionary Attacks on Speaker Verification	Apr 24, 2022	Speaker Verification, Voice Cloning	Code Available
Self-supervised learning for robust voice cloning	Apr 7, 2022	Self-Supervised Learning, Speech Synthesis	Unverified
Improve few-shot voice cloning using multi-modal learning	Mar 18, 2022	text-to-speech, Text to Speech	Unverified
Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention	Jan 25, 2022	Form, Speech Synthesis	Unverified
V2C: Visual Voice Cloning	Nov 25, 2021	Voice Cloning	Unverified
Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning	Nov 14, 2021	Disentanglement, Meta-Learning	Unverified
SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines	Nov 6, 2021	Disentanglement, Speaker Verification	Code Available
Revisiting IPA-based Cross-lingual Text-to-speech	Oct 14, 2021	text-to-speech, Text to Speech	Unverified
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data	Oct 14, 2021	text-to-speech, Text to Speech	Unverified
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech	Oct 14, 2021	Disentanglement, text-to-speech	Unverified
Adapting TTS models For New Speakers using Transfer Learning	Oct 12, 2021	text-to-speech, Text to Speech	Unverified
Discovery of Single Independent Latent Variable	Oct 12, 2021	Image Generation, Voice Cloning	Code Available
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation	Jul 19, 2021	Data Augmentation, Decoder	Unverified
AI based Presentation Creator With Customized Audio Content Delivery	Jun 27, 2021	Generative Adversarial Network, Voice Cloning	Unverified
Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance	Jun 25, 2021	Quantization, Speaker anonymization	Unverified
The AS-NU System for the M2VoC Challenge	Apr 7, 2021	Voice Cloning	Unverified
The Multi-speaker Multi-style Voice Cloning Challenge 2021	Apr 5, 2021	Benchmarking, Voice Cloning	Unverified
CUHK-EE Voice Cloning System for ICASSP 2021 M2VoC Challenge	Mar 8, 2021	Voice Cloning	Unverified
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech	Mar 6, 2021	text-to-speech, Text to Speech	Unverified
