Voice Conversion

Source: Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

Papers

Showing 151–200 of 520 papers

Title	Date	Tasks	Status	Score
Spoof detection using time-delay shallow neural network and feature switching	Apr 16, 2019	Speaker Verification, Speech Synthesis	Code Available	5
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis	Feb 28, 2020	Speech Synthesis, text-to-speech	Code Available	5
StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion	Jul 29, 2019	Voice Conversion	Code Available	5
Epoch-Synchronous Overlap-Add (ESOLA) for Time- and Pitch-Scale Modification of Speech Signals	Jan 19, 2018	Speech Synthesis, Voice Conversion	Code Available	5
SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines	Nov 6, 2021	Disentanglement, Speaker Verification	Code Available	5
Scalable Factorized Hierarchical Variational Autoencoder Training	Apr 9, 2018	Disentanglement, Hyperparameter Optimization	Code Available	5
Emotional Voice Conversion using Multitask Learning with Text-to-speech	Nov 11, 2019	Decoder, text-to-speech	Code Available	5
Private kNN-VC: Interpretable Anonymization of Converted Speech	May 23, 2025	Speaker anonymization, Speaker Recognition	Code Available	5
Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation	Dec 8, 2020	Attribute, Disentanglement	Code Available	5
Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts	May 10, 2022	Speech Synthesis, Voice Conversion	Code Available	5
Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks	Nov 30, 2017	Voice Conversion	Code Available	5
Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge	Feb 18, 2025	Voice Conversion	Code Available	5
NVC-Net: End-to-End Adversarial Voice Conversion	Jun 2, 2021	GPU, Speech Synthesis	Code Available	5
Multi-task learning improves synthetic speech detection	Apr 27, 2022	Multi-Task Learning, Speaker Verification	Code Available	5
Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations	Apr 9, 2018	Decoder, Voice Conversion	Code Available	5
Non-Parallel Training Approach for Emotional Voice Conversion Using CycleGAN	Nov 1, 2023	Voice Conversion	Code Available	5
Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion	May 25, 2023	Audio Deepfake Detection, DeepFake Detection	Code Available	5
MelGAN-VC: Voice Conversion and Audio Style Transfer on arbitrarily long samples using Spectrograms	Oct 8, 2019	Generative Adversarial Network, Music Style Transfer	Code Available	5
Mel-spectrogram augmentation for sequence to sequence voice conversion	Jan 6, 2020	Voice Conversion	Code Available	5
Anonymising Elderly and Pathological Speech: Voice Conversion Using DDSP and Query-by-Example	Oct 20, 2024	Voice Conversion	Code Available	5
Hear Your Face: Face-based voice conversion with F0 estimation	Aug 19, 2024	Voice Conversion	Code Available	5
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints	Nov 16, 2022	Voice Conversion	Unverified	0
A Unified Model For Voice and Accent Conversion In Speech and Singing using Self-Supervised Learning and Feature Extraction	Dec 11, 2024	Decoder, Self-Supervised Learning	Unverified	0
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices	Aug 15, 2020	Speaker Recognition, Voice Conversion	Unverified	0
Audio Deep Fake Detection System with Neural Stitching for ADD 2022	Apr 19, 2022	text-to-speech, Text to Speech	Unverified	0
Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling	Aug 9, 2020	Deep Learning, Speech Synthesis	Unverified	0
Deep Learning-based F0 Synthesis for Speaker Anonymization	Jun 29, 2023	Deep Learning, Speaker anonymization	Unverified	0
Audio Anti-spoofing Using a Simple Attention Module and Joint Optimization Based on Additive Angular Margin Loss and Meta-learning	Nov 17, 2022	Binary Classification, Meta-Learning	Unverified	0
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR	Mar 11, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Adaptive Speech Duration Modification using a Deep-Generative Framework	Sep 29, 2021	Decoder, Dynamic Time Warping	Unverified	0
DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding	Oct 13, 2021	Speech Synthesis, Voice Conversion	Unverified	0
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms	Nov 9, 2018	GPU, Image Captioning	Unverified	0
Attentive activation function for improving end-to-end spoofing countermeasure systems	May 3, 2022	Speech Synthesis, Voice Conversion	Unverified	0
Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE	Mar 28, 2022	Speech Synthesis, Voice Conversion	Unverified	0
D-CAPTCHA++: A Study of Resilience of Deepfake CAPTCHA under Transferable Imperceptible Adversarial Attack	Sep 11, 2024	Adversarial Attack, Audio Synthesis	Unverified	0
Data Augmentation for Diverse Voice Conversion in Noisy Environments	May 18, 2023	Data Augmentation, Decoder	Unverified	0
Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion	Dec 29, 2023	Contrastive Learning, Disentanglement	Unverified	0
ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech	Feb 13, 2025	Adversarial Attack, Adversarial Attack Detection	Unverified	0
An Adaptive Learning based Generative Adversarial Network for One-To-One Voice Conversion	Apr 25, 2021	Generative Adversarial Network, Speech Synthesis	Unverified	0
ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations	Feb 16, 2023	Self-Supervised Learning, Speaker Verification	Unverified	0
CycleFlow: Purify Information Factors by Cycle Loss	Oct 18, 2021	Voice Conversion	Unverified	0
ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech	Feb 11, 2021	Speaker Verification, Speech Synthesis	Unverified	0
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion	Nov 24, 2023	Data Augmentation, Retrieval	Unverified	0
CTEFM-VC: Zero-Shot Voice Conversion Based on Content-Aware Timbre Ensemble Modeling and Flow Matching	Nov 4, 2024	Speaker Verification, Voice Conversion	Unverified	0
ALO-VC: Any-to-any Low-latency One-shot Voice Conversion	Jun 1, 2023	CPU, Voice Conversion	Unverified	0
Cross-speaker style transfer for text-to-speech using data augmentation	Feb 10, 2022	Data Augmentation, Style Transfer	Unverified	0
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation	Apr 21, 2022	Data Augmentation, text-to-speech	Unverified	0
Crossmodal Voice Conversion	Apr 9, 2019	Decoder, Voice Conversion	Unverified	0
A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment	Apr 23, 2018	Benchmarking, Speaker Verification	Unverified	0
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment	May 8, 2023	cross-modal alignment, Rhythm	Unverified	0

Datasets: ZeroSpeech 2019 English, LibriSpeech test-clean, VCTK

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	VQ-CPC	Speaker Similarity	3.8	—	Unverified
2	VQ-VAE	Speaker Similarity	3.49	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	kNN-VC (prematched HiFiGAN)	Character Error Rate (CER)	2.96	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DISSC	Total Length Error (TLE)	0.83	—	Unverified