Voice Conversion

Source: Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

Papers

Showing 126–150 of 520 papers

Title	Date	Tasks	Status	Hype
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment	Jan 16, 2024	Disentanglement, Self-Supervised Learning	Code Available	2
Transfer the linguistic representations from TTS to accent conversion with non-parallel data	Jan 7, 2024	text-to-speech, Text to Speech	Unverified	0
StreamVC: Real-Time Low-Latency Voice Conversion	Jan 5, 2024	Speech Synthesis, Voice Conversion	Unverified	0
CoMoSVC: Consistency Model-based Singing Voice Conversion	Jan 3, 2024	GPU, model	Code Available	2
Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion	Dec 29, 2023	Contrastive Learning, Disentanglement	Unverified	0
AE-Flow: AutoEncoder Normalizing Flow	Dec 27, 2023	text-to-speech, Text to Speech	Unverified	0
Exploring data augmentation in bias mitigation against non-native-accented speech	Dec 24, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Creating New Voices using Normalizing Flows	Dec 22, 2023	Speech Synthesis, text-to-speech	Unverified	0
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform	Dec 17, 2023	Image Segmentation, Segmentation	Code Available	1
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection	Dec 15, 2023	Audio Deepfake Detection, Continual Learning	Code Available	1
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention	Dec 14, 2023	Position, Voice Conversion	Unverified	0
PerMod: Perceptually Grounded Voice Modification with Latent Diffusion Models	Dec 13, 2023	Sentence, Voice Conversion	Unverified	0
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes	Nov 29, 2023	Face Recognition, Face Swapping	Unverified	0
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion	Nov 24, 2023	Data Augmentation, Retrieval	Unverified	0
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis	Nov 21, 2023	Speech Synthesis, Super-Resolution	Code Available	3
Improving fairness for spoken language understanding in atypical speech with Text-to-Speech	Nov 16, 2023	Data Augmentation, Fairness	Code Available	1
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion	Nov 14, 2023	Deep Learning, Diversity	Unverified	0
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion	Nov 13, 2023	Contrastive Learning, EEG	Code Available	1
Parrot-Trained Adversarial Examples: Pushing the Practicality of Black-Box Audio Attacks against Speaker Recognition Models	Nov 13, 2023	Sentence, Speaker Recognition	Unverified	0
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation	Nov 8, 2023	Style Transfer, Voice Conversion	Code Available	2
Non-Parallel Training Approach for Emotional Voice Conversion Using CycleGAN	Nov 1, 2023	Voice Conversion	Code Available	0
Low-latency Real-time Voice Conversion on CPU	Nov 1, 2023	CPU, Knowledge Distillation	Code Available	2
An overview of text-to-speech systems and media applications	Oct 22, 2023	Acoustic Modelling, text-to-speech	Unverified	0
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations	Oct 14, 2023	Self-Supervised Learning, Speaker Verification	Unverified	0
Voice Conversion for Stuttered Speech, Instruments, Unseen Languages and Textually Described Voices	Oct 12, 2023	Voice Conversion	Unverified	0

Datasets: ZeroSpeech 2019 English, LibriSpeech test-clean, VCTK

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	VQ-CPC	Speaker Similarity	3.8	—	Unverified
2	VQ-VAE	Speaker Similarity	3.49	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	kNN-VC (prematched HiFiGAN)	Character Error Rate (CER)	2.96	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DISSC	Total Length Error (TLE)	0.83	—	Unverified