SOTAVerified

Voice Cloning

Voice cloning is a highly desired feature for personalized speech interfaces. A neural voice cloning system learns to synthesize a person’s voice from only a few audio samples.

Papers

Showing 51-75 of 112 papers

| Title | Status | Hype |
|---|---|---|
| Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices | | 0 |
| XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model | Code | 1 |
| Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis | Code | 2 |
| Non-autoregressive real-time Accent Conversion model with voice cloning | | 0 |
| PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset | Code | 0 |
| StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing | Code | 2 |
| MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech | | 0 |
| Proactive Detection of Voice Cloning with Localized Watermarking | Code | 4 |
| Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages | | 0 |
| Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis | | 0 |
| OpenVoice: Versatile Instant Voice Cloning | Code | 7 |
| MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer's Care Via Unleashing Generative AI | | 0 |
| Learning Through AI-Clones: Enhancing Self-Perception and Presentation Performance | | 0 |
| High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models | | 0 |
| Collaborative Watermarking for Adversarial Speech Synthesis | | 0 |
| TRAVID: An End-to-End Video Translation Framework | | 0 |
| Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion | | 0 |
| Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques | Code | 1 |
| Single and Multi-Speaker Cloned Voice Detection: From Perceptual to Learned Features | Code | 1 |
| Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis | | 0 |
| Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert | Code | 4 |
| ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech | Code | 6 |
| Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System | | 0 |
| Low-Resource Multilingual and Zero-Shot Multispeaker TTS | Code | 0 |
| Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis | Code | 0 |

No leaderboard results yet.