Voice Conversion

I remember all the summer days Drinking wine in the sunshine I hope it never leaves And I remember all the summer nights Staring at you in the moonlight I hope you never leave 'cause baby You're so good to me You have all that all that I ever need It's easy to love you So easy to love you Ooh you know it's true The best part of being with you To know you're with me It's not so hard to say It's easy to love you I remember all those winter days frozen In the cold tryin' to get you home Should I be moving in, we can be together then Remember spending all those winter nights Stayin' inside by the warm fire Yeah you gotta know that I can never let you go You and I have the rest of our lives to say It's easy to love you So easy to love you Ooh you know it's true The best part of being with you To know you're with me It's not so hard to say It's easy to love you Can anybody else see it? Mm, can anybody else see what I do? Can anybody else feel it? Oh, can anybody else feel the way I do? But now I'm with you Hard to forget all the moments when We'd be sitting there hoping it would never end 'Cause this is meant to be So baby, will you marry me? It's easy to love you So easy to love you Ooh, you know it's true The best part of being with you To know you are with me It's not so hard to say It's easy to love you You and me will be together I know our love will last forever You and me will be together I know our love will last forever You know it's true The best part of being with you You're easy to love

Source: Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

Papers

Showing 101–150 of 520 papers

Title	Date	Tasks	Status	Hype
HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods	Sep 15, 2023	Audio Deepfake Detection, DeepFake Detection	Code Available	1
Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion	Jan 22, 2020	Disentanglement, Voice Conversion	Code Available	1
Hiding speaker's sex in speech using zero-evidence speaker representation in an analysis/synthesis pipeline	Nov 29, 2022	Voice Conversion	Code Available	1
Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders	Aug 29, 2018	Voice Conversion	Code Available	1
Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques	Aug 5, 2023	Quantization, Speaker anonymization	Code Available	1
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion	Nov 13, 2023	Contrastive Learning, EEG	Code Available	1
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training	Mar 31, 2021	text-to-speech, Text to Speech	Code Available	1
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion	Jun 18, 2021	Disentanglement, Quantization	Code Available	1
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection	Dec 15, 2023	Audio Deepfake Detection, Continual Learning	Code Available	1
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone	Dec 4, 2021	Speech Synthesis, Text-To-Speech Synthesis	Code Available	1
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion	Oct 22, 2020	Voice Conversion	Code Available	1
CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer	Nov 30, 2021	Voice Conversion	Code Available	1
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform	Dec 17, 2023	Image Segmentation, Segmentation	Code Available	1
MOSNet: Deep Learning based Objective Assessment for Voice Conversion	Apr 17, 2019	Deep Learning, Voice Conversion	Code Available	1
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection	Oct 18, 2021	Speech Synthesis, Synthetic Speech Detection	Code Available	1
FSD: An Initial Chinese Dataset for Fake Song Detection	Sep 5, 2023	Audio Deepfake Detection, DeepFake Detection	Code Available	1
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder	Apr 15, 2020	Style Transfer, Voice Conversion	Code Available	1
Deep Learning Based Assessment of Synthetic Speech Naturalness	Apr 23, 2021	Deep Learning, Prediction	Code Available	1
Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion	Sep 5, 2023	Voice Conversion	Code Available	1
FastSVC: Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation	Nov 11, 2020	Voice Conversion	Code Available	1
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models	Oct 11, 2022	Disentanglement, Generative Adversarial Network	Code Available	1
Low-Latency Real-Time Non-Parallel Voice Conversion based on Cyclic Variational Autoencoder and Multiband WaveRNN with Data-Driven Linear Prediction	May 20, 2021	CPU, Voice Conversion	Code Available	1
Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph	Feb 24, 2022	Decoder, Quantization	Code Available	1
Defending Your Voice: Adversarial Attack on Voice Conversion	May 18, 2020	Adversarial Attack, Voice Conversion	Code Available	1
Toward Degradation-Robust Voice Conversion	Oct 14, 2021	Denoising, Speech Enhancement	Code Available	1
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints	Nov 16, 2022	Voice Conversion	Unverified	0
A Unified Model For Voice and Accent Conversion In Speech and Singing using Self-Supervised Learning and Feature Extraction	Dec 11, 2024	Decoder, Self-Supervised Learning	Unverified	0
Adaptive Speech Duration Modification using a Deep-Generative Framework	Sep 29, 2021	Decoder, Dynamic Time Warping	Unverified	0
Emotion Intensity and its Control for Emotional Voice Conversion	Jan 10, 2022	Emotion Classification, Voice Conversion	Unverified	0
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices	Aug 15, 2020	Speaker Recognition, Voice Conversion	Unverified	0
Audio Deep Fake Detection System with Neural Stitching for ADD 2022	Apr 19, 2022	text-to-speech, Text to Speech	Unverified	0
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR	Mar 11, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling	Aug 9, 2020	Deep Learning, Speech Synthesis	Unverified	0
Deep Learning-based F0 Synthesis for Speaker Anonymization	Jun 29, 2023	Deep Learning, Speaker anonymization	Unverified	0
Audio Anti-spoofing Using a Simple Attention Module and Joint Optimization Based on Additive Angular Margin Loss and Meta-learning	Nov 17, 2022	Binary Classification, Meta-Learning	Unverified	0
Many-to-Many Voice Conversion with Out-of-Dataset Speaker Support	Apr 30, 2019	Speaker Identification, Voice Conversion	Unverified	0
DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding	Oct 13, 2021	Speech Synthesis, Voice Conversion	Unverified	0
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms	Nov 9, 2018	GPU, Image Captioning	Unverified	0
Attentive activation function for improving end-to-end spoofing countermeasure systems	May 3, 2022	Speech Synthesis, Voice Conversion	Unverified	0
Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE	Mar 28, 2022	Speech Synthesis, Voice Conversion	Unverified	0
ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations	Feb 16, 2023	Self-Supervised Learning, Speaker Verification	Unverified	0
D-CAPTCHA++: A Study of Resilience of Deepfake CAPTCHA under Transferable Imperceptible Adversarial Attack	Sep 11, 2024	Adversarial Attack, Audio Synthesis	Unverified	0
Data Augmentation for Diverse Voice Conversion in Noisy Environments	May 18, 2023	Data Augmentation, Decoder	Unverified	0
Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion	Dec 29, 2023	Contrastive Learning, Disentanglement	Unverified	0
ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech	Feb 13, 2025	Adversarial Attack, Adversarial Attack Detection	Unverified	0
An Adaptive Learning based Generative Adversarial Network for One-To-One Voice Conversion	Apr 25, 2021	Generative Adversarial Network, Speech Synthesis	Unverified	0
End-to-End Voice Conversion with Information Perturbation	Jun 15, 2022	Voice Conversion	Unverified	0
ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech	Feb 11, 2021	Speaker Verification, Speech Synthesis	Unverified	0
CycleFlow: Purify Information Factors by Cycle Loss	Oct 18, 2021	Voice Conversion	Unverified	0
AC-VC: Non-parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion	Nov 12, 2021	Voice Conversion	Unverified	0

Datasets: ZeroSpeech 2019 English, LibriSpeech test-clean, VCTK

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	VQ-CPC	Speaker Similarity	3.8	—	Unverified
2	VQ-VAE	Speaker Similarity	3.49	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	kNN-VC (prematched HiFiGAN)	Character Error Rate (CER)	2.96	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DISSC	Total Length Error (TLE)	0.83	—	Unverified