Voice Conversion

Source: Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

Papers

Showing 151–200 of 520 papers

Title	Date	Tasks	Status
AC-VC: Non-parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion	Nov 12, 2021	Voice Conversion	Unverified
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion	Nov 24, 2023	Data Augmentation, Retrieval	Unverified
CTEFM-VC: Zero-Shot Voice Conversion Based on Content-Aware Timbre Ensemble Modeling and Flow Matching	Nov 4, 2024	Speaker Verification, Voice Conversion	Unverified
ALO-VC: Any-to-any Low-latency One-shot Voice Conversion	Jun 1, 2023	CPU, Voice Conversion	Unverified
Automatic Voice Identification after Speech Resynthesis using PPG	Aug 5, 2024	Resynthesis, Speaker Verification	Unverified
Cross-speaker style transfer for text-to-speech using data augmentation	Feb 10, 2022	Data Augmentation, Style Transfer	Unverified
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation	Apr 21, 2022	Data Augmentation, text-to-speech	Unverified
Crossmodal Voice Conversion	Apr 9, 2019	Decoder, Voice Conversion	Unverified
A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment	Apr 23, 2018	Benchmarking, Speaker Verification	Unverified
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment	May 8, 2023	cross-modal alignment, Rhythm	Unverified
High Fidelity Speech Regeneration with Application to Speech Enhancement	Jan 31, 2021	Denoising, Speaker Separation	Unverified
Cross-modal Face- and Voice-style Transfer	Feb 27, 2023	Diversity, Image-to-Image Translation	Unverified
Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation	Oct 31, 2022	Decoder, Disentanglement	Unverified
Cross-lingual Knowledge Distillation via Flow-based Voice Conversion for Robust Polyglot Text-To-Speech	Sep 15, 2023	Knowledge Distillation, Speech Synthesis	Unverified
Creating Personalized Synthetic Voices from Post-Glossectomy Speech with Guided Diffusion Models	May 27, 2023	Speech Synthesis, Voice Conversion	Unverified
ArVoice: A Multi-Speaker Dataset for Arabic Speech Synthesis	May 26, 2025	DeepFake Detection, Face Swapping	Unverified
A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion	Jun 28, 2022	Speaker Recognition, Voice Conversion	Unverified
A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech	May 8, 2018	Language Identification, Speech Synthesis	Unverified
Generating and Detecting Various Types of Fake Image and Audio Content: A Review of Modern Deep Learning Technologies and Tools	Jan 7, 2025	Face Swapping, Voice Conversion	Unverified
AE-Flow: AutoEncoder Normalizing Flow	Dec 27, 2023	text-to-speech, Text to Speech	Unverified
Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations	Aug 24, 2023	Representation Learning, Speech Synthesis	Unverified
Learning Speech Representation From Contrastive Token-Acoustic Pretraining	Sep 1, 2023	Audio Classification, Automatic Speech Recognition	Unverified
Generalizable Audio Deepfake Detection via Latent Space Refinement and Augmentation	Jan 24, 2025	Audio Deepfake Detection, DeepFake Detection	Unverified
CO-VADA: A Confidence-Oriented Voice Augmentation Debiasing Approach for Fair Speech Emotion Recognition	Jun 6, 2025	Emotion Recognition, Fairness	Unverified
High-quality nonparallel voice conversion based on cycle-consistent adversarial network	Apr 2, 2018	Generative Adversarial Network, Image-to-Image Translation	Unverified
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion	Oct 20, 2021	Disentanglement, Voice Conversion	Unverified
ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion	Nov 5, 2018	Speech Enhancement, Voice Conversion	Unverified
Are disentangled representations all you need to build speaker anonymization systems?	Aug 22, 2022	All, Automatic Speech Recognition	Unverified
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation	Sep 3, 2024	Voice Conversion	Unverified
FastVC: Fast Voice Conversion with non-parallel data	Oct 8, 2020	Voice Conversion	Unverified
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model	May 2, 2024	Denoising, Emotion Recognition	Unverified
Adversarial Transformation of Spoofing Attacks for Voice Biometrics	Jan 4, 2022	Speaker Verification, Voice Conversion	Unverified
Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion	Jun 28, 2023	Backdoor Attack, Voice Conversion	Unverified
FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning	Apr 22, 2025	Deep Learning, Speaker Verification	Unverified
Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos	Jun 9, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
SongBsAb: A Dual Prevention Approach against Singing Voice Conversion based Illegal Song Covers	Jan 30, 2024	Voice Conversion	Unverified
FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks	Feb 6, 2025	Resynthesis, Voice Conversion	Unverified
Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment	Sep 18, 2023	Voice Conversion	Unverified
Conditional Deep Hierarchical Variational Autoencoder for Voice Conversion	Dec 6, 2021	Decoder, Voice Conversion	Unverified
EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion	May 22, 2025	Decoder, Voice Conversion	Unverified
Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer	Jul 8, 2021	Emotion Recognition, Speech Emotion Recognition	Unverified
Comparison of Speech Representations for the MOS Prediction System	Jun 28, 2022	Self-Supervised Learning, text-to-speech	Unverified
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion	Jun 2, 2021	Voice Conversion	Unverified
Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion	Jul 27, 2019	Voice Conversion	Unverified
Adversarial speech for voice privacy protection from Personalized Speech generation	Jan 22, 2024	Speaker Verification, text-to-speech	Unverified
Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements	Apr 27, 2025	Generative Adversarial Network, Speech Synthesis	Unverified
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features	Nov 9, 2022	Decoder, Voice Conversion	Unverified
Creating New Voices using Normalizing Flows	Dec 22, 2023	Speech Synthesis, text-to-speech	Unverified
GenVC: Self-Supervised Zero-Shot Voice Conversion	Feb 6, 2025	Voice Conversion	Unverified
Exploring the Importance of F0 Trajectories for Speaker Anonymization using X-vectors and Neural Waveform Models	Oct 13, 2021	Resynthesis, Speaker anonymization	Unverified


Benchmark Results

ZeroSpeech 2019 English

#	Model	Metric	Claimed	Verified	Status
1	VQ-CPC	Speaker Similarity	3.8	—	Unverified
2	VQ-VAE	Speaker Similarity	3.49	—	Unverified

LibriSpeech test-clean

#	Model	Metric	Claimed	Verified	Status
1	kNN-VC (prematched HiFiGAN)	Character Error Rate (CER)	2.96	—	Unverified

VCTK

#	Model	Metric	Claimed	Verified	Status
1	DISSC	Total Length Error (TLE)	0.83	—	Unverified