Voice Cloning

Voice cloning is a highly desired feature for personalized speech interfaces. A neural voice cloning system learns to synthesize a person’s voice from only a few audio samples.

Papers

Showing 51-100 of 112 papers

| Title | Status | Hype |
| --- | --- | --- |
| Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices | | 0 |
| XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model | Code | 1 |
| Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis | Code | 2 |
| Non-autoregressive real-time Accent Conversion model with voice cloning | | 0 |
| PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset | Code | 0 |
| StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing | Code | 2 |
| MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech | | 0 |
| Proactive Detection of Voice Cloning with Localized Watermarking | Code | 4 |
| Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages | | 0 |
| Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis | | 0 |
| OpenVoice: Versatile Instant Voice Cloning | Code | 7 |
| MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer's Care Via Unleashing Generative AI | | 0 |
| Learning Through AI-Clones: Enhancing Self-Perception and Presentation Performance | | 0 |
| High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models | | 0 |
| Collaborative Watermarking for Adversarial Speech Synthesis | | 0 |
| TRAVID: An End-to-End Video Translation Framework | | 0 |
| Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion | | 0 |
| Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques | Code | 1 |
| Single and Multi-Speaker Cloned Voice Detection: From Perceptual to Learned Features | Code | 1 |
| Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis | | 0 |
| Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert | Code | 4 |
| ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech | Code | 6 |
| Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System | | 0 |
| Low-Resource Multilingual and Zero-Shot Multispeaker TTS | Code | 0 |
| Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis | Code | 0 |
| Mix and Match: An Empirical Study on Training Corpus Composition for Polyglot Text-To-Speech (TTS) | | 0 |
| Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE | | 0 |
| Dictionary Attacks on Speaker Verification | Code | 0 |
| Self-supervised learning for robust voice cloning | | 0 |
| Improve few-shot voice cloning using multi-modal learning | | 0 |
| Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention | | 0 |
| V2C: Visual Voice Cloning | | 0 |
| Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning | | 0 |
| SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines | Code | 0 |
| Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech | | 0 |
| Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data | | 0 |
| Revisiting IPA-based Cross-lingual Text-to-speech | | 0 |
| Discovery of Single Independent Latent Variable | Code | 0 |
| Adapting TTS models For New Speakers using Transfer Learning | | 0 |
| Translatotron 2: High-quality direct speech-to-speech translation with voice preservation | | 0 |
| AI based Presentation Creator With Customized Audio Content Delivery | | 0 |
| Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text | Code | 1 |
| Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance | | 0 |
| Building Bilingual and Code-Switched Voice Conversion with Limited Training Data Using Embedding Consistency Loss | Code | 1 |
| The AS-NU System for the M2VoC Challenge | | 0 |
| The Multi-speaker Multi-style Voice Cloning Challenge 2021 | | 0 |
| CUHK-EE Voice Cloning System for ICASSP 2021 M2VoC Challenge | | 0 |
| Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech | Code | 0 |
| Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning | | 0 |
| Expressive Neural Voice Cloning | | 0 |

No leaderboard results yet.