Voice Cloning

Voice cloning is a highly desired feature for personalized speech interfaces. A neural voice cloning system learns to synthesize a person’s voice from only a few audio samples.

Papers

Showing 51-100 of 112 papers

| Title | Status | Hype |
| --- | --- | --- |
| Can DeepFake Speech be Reliably Detected? | | 0 |
| Algorithms For Automatic Accentuation And Transcription Of Russian Texts In Speech Recognition Systems | | 0 |
| Augmentation through Laundering Attacks for Audio Spoof Detection | | 0 |
| Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space | | 0 |
| Multi-modal Adversarial Training for Zero-Shot Voice Cloning | | 0 |
| Is Audio Spoof Detection Robust to Laundering Attacks? | Code | 0 |
| kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech | | 0 |
| Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language | | 0 |
| WavLM model ensemble for audio deepfake detection | Code | 0 |
| Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems | | 0 |
| A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge | | 0 |
| DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing | | 0 |
| Spoken Language Corpora Augmentation with Domain-Specific Voice-Cloned Speech | | 0 |
| Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices | | 0 |
| Non-autoregressive real-time Accent Conversion model with voice cloning | | 0 |
| PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset | Code | 0 |
| MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech | | 0 |
| Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages | | 0 |
| Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis | | 0 |
| MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer's Care Via Unleashing Generative AI | | 0 |
| Learning Through AI-Clones: Enhancing Self-Perception and Presentation Performance | | 0 |
| High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models | | 0 |
| Collaborative Watermarking for Adversarial Speech Synthesis | | 0 |
| TRAVID: An End-to-End Video Translation Framework | | 0 |
| Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion | | 0 |
| Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis | | 0 |
| Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System | | 0 |
| Low-Resource Multilingual and Zero-Shot Multispeaker TTS | Code | 0 |
| Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis | Code | 0 |
| Mix and Match: An Empirical Study on Training Corpus Composition for Polyglot Text-To-Speech (TTS) | | 0 |
| Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE | | 0 |
| Dictionary Attacks on Speaker Verification | Code | 0 |
| Self-supervised learning for robust voice cloning | | 0 |
| Improve few-shot voice cloning using multi-modal learning | | 0 |
| Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention | | 0 |
| V2C: Visual Voice Cloning | | 0 |
| Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning | | 0 |
| SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines | Code | 0 |
| Revisiting IPA-based Cross-lingual Text-to-speech | | 0 |
| Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data | | 0 |
| Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech | | 0 |
| Adapting TTS models For New Speakers using Transfer Learning | | 0 |
| Discovery of Single Independent Latent Variable | Code | 0 |
| Translatotron 2: High-quality direct speech-to-speech translation with voice preservation | | 0 |
| AI based Presentation Creator With Customized Audio Content Delivery | | 0 |
| Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance | | 0 |
| The AS-NU System for the M2VoC Challenge | | 0 |
| The Multi-speaker Multi-style Voice Cloning Challenge 2021 | | 0 |
| CUHK-EE Voice Cloning System for ICASSP 2021 M2VoC Challenge | | 0 |
| Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech | Code | 0 |

No leaderboard results yet.