Voice Conversion

Source: Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

Papers

Showing 101–150 of 520 papers

Title	Date	Tasks	Status	Hype
Coding Speech through Vocal Tract Kinematics	Jun 18, 2024	Voice Conversion	Code Available	2
Improving child speech recognition with augmented child-like speech	Jun 12, 2024	Speech Recognition	Unverified	0
DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion	Jun 12, 2024	Automatic Speech Recognition (ASR)	Unverified	0
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models	Jun 12, 2024	Voice Conversion, Voice Similarity	Unverified	0
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion	Jun 9, 2024	SSIM, Voice Conversion	Unverified	0
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance	Jun 8, 2024	Voice Conversion	Unverified	0
The Database and Benchmark for the Source Speaker Tracing Challenge 2024	Jun 7, 2024	Multi-Task Learning, Speaker Verification	Unverified	0
Towards Naturalistic Voice Conversion: NaturalVoices Dataset with an Automatic Processing Pipeline	Jun 6, 2024	Voice Conversion	Unverified	0
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion	Jun 4, 2024	In-Context Learning, Language Modeling	Unverified	0
Real-Time and Accurate: Zero-shot High-Fidelity Singing Voice Conversion with Multi-Condition Flow Synthesis	May 23, 2024	Attribute, Decoder	Unverified	0
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model	May 2, 2024	Denoising, Emotion Recognition	Unverified	0
Who is Authentic Speaker	Apr 30, 2024	Speaker Recognition, Voice Conversion	Unverified	0
FlashSpeech: Efficient Zero-Shot Speech Synthesis	Apr 23, 2024	Rhythm, Speech Synthesis	Code Available	3
Retrieval-Augmented Audio Deepfake Detection	Apr 22, 2024	Audio Deepfake Detection, DeepFake Detection	Unverified	0
PSCodec: A Series of High-Fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders	Apr 3, 2024	Representation Learning, Speaker Verification	Unverified	0
Voice Conversion Augmentation for Speaker Recognition on Defective Datasets	Apr 1, 2024	Speaker Recognition, Voice Conversion	Unverified	0
PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion	Mar 3, 2024	Voice Conversion	Unverified	0
Transcription and translation of videos using fine-tuned XLSR Wav2Vec2 on custom dataset and mBART	Mar 1, 2024	Retrieval, Translation	Unverified	0
High-Fidelity Neural Phonetic Posteriorgrams	Feb 27, 2024	Voice Conversion	Code Available	2
Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations	Feb 5, 2024	Decoder, In-Context Learning	Unverified	0
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition	Jan 31, 2024	Decoder, Language Modeling	Unverified	0
SongBsAb: A Dual Prevention Approach against Singing Voice Conversion based Illegal Song Covers	Jan 30, 2024	Voice Conversion	Unverified	0
SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation	Jan 24, 2024	Text to Speech	Code Available	5
Adversarial speech for voice privacy protection from Personalized Speech generation	Jan 22, 2024	Speaker Verification, Text to Speech	Unverified	0
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion	Jan 19, 2024	Language Modeling	Unverified	0
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment	Jan 16, 2024	Disentanglement, Self-Supervised Learning	Code Available	2
Transfer the linguistic representations from TTS to accent conversion with non-parallel data	Jan 7, 2024	Text to Speech	Unverified	0
StreamVC: Real-Time Low-Latency Voice Conversion	Jan 5, 2024	Speech Synthesis, Voice Conversion	Unverified	0
CoMoSVC: Consistency Model-based Singing Voice Conversion	Jan 3, 2024	GPU, model	Code Available	2
Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion	Dec 29, 2023	Contrastive Learning, Disentanglement	Unverified	0
AE-Flow: AutoEncoder Normalizing Flow	Dec 27, 2023	Text to Speech	Unverified	0
Exploring data augmentation in bias mitigation against non-native-accented speech	Dec 24, 2023	Automatic Speech Recognition (ASR)	Unverified	0
Creating New Voices using Normalizing Flows	Dec 22, 2023	Speech Synthesis, Text to Speech	Unverified	0
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform	Dec 17, 2023	Image Segmentation, Segmentation	Code Available	1
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection	Dec 15, 2023	Audio Deepfake Detection, Continual Learning	Code Available	1
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention	Dec 14, 2023	Position, Voice Conversion	Unverified	0
PerMod: Perceptually Grounded Voice Modification with Latent Diffusion Models	Dec 13, 2023	Sentence, Voice Conversion	Unverified	0
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes	Nov 29, 2023	Face Recognition, Face Swapping	Unverified	0
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion	Nov 24, 2023	Data Augmentation, Retrieval	Unverified	0
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis	Nov 21, 2023	Speech Synthesis, Super-Resolution	Code Available	3
Improving fairness for spoken language understanding in atypical speech with Text-to-Speech	Nov 16, 2023	Data Augmentation, Fairness	Code Available	1
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion	Nov 14, 2023	Deep Learning, Diversity	Unverified	0
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion	Nov 13, 2023	Contrastive Learning, EEG	Code Available	1
Parrot-Trained Adversarial Examples: Pushing the Practicality of Black-Box Audio Attacks against Speaker Recognition Models	Nov 13, 2023	Sentence, Speaker Recognition	Unverified	0
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation	Nov 8, 2023	Style Transfer, Voice Conversion	Code Available	2
Non-Parallel Training Approach for Emotional Voice Conversion Using CycleGAN	Nov 1, 2023	Voice Conversion	Code Available	0
Low-latency Real-time Voice Conversion on CPU	Nov 1, 2023	CPU, Knowledge Distillation	Code Available	2
An overview of text-to-speech systems and media applications	Oct 22, 2023	Acoustic Modelling, Text to Speech	Unverified	0
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations	Oct 14, 2023	Self-Supervised Learning, Speaker Verification	Unverified	0
Voice Conversion for Stuttered Speech, Instruments, Unseen Languages and Textually Described Voices	Oct 12, 2023	Voice Conversion	Unverified	0

Benchmark Results

ZeroSpeech 2019 English

#	Model	Metric	Claimed	Verified	Status
1	VQ-CPC	Speaker Similarity	3.8	—	Unverified
2	VQ-VAE	Speaker Similarity	3.49	—	Unverified

LibriSpeech test-clean

#	Model	Metric	Claimed	Verified	Status
1	kNN-VC (prematched HiFiGAN)	Character Error Rate (CER)	2.96	—	Unverified

VCTK

#	Model	Metric	Claimed	Verified	Status
1	DISSC	Total Length Error (TLE)	0.83	—	Unverified