Voice Conversion

Source: Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

Papers

Showing 301–350 of 520 papers

Title	Date	Tasks	Status
Preserving background sound in noise-robust voice conversion via multi-task learning	Nov 6, 2022	Multi-Task Learning, Voice Conversion	Unverified
Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation	Oct 31, 2022	Decoder, Disentanglement	Unverified
Combining Automatic Speaker Verification and Prosody Analysis for Synthetic Speech Detection	Oct 31, 2022	Audio Compression, Face Swapping	Unverified
Streaming Voice Conversion Via Intermediate Bottleneck Features And Non-streaming Teacher Guidance	Oct 27, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
V-Cloak: Intelligibility-, Naturalness- & Timbre-Preserving Real-Time Voice Anonymization	Oct 27, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE	Oct 25, 2022	Disentanglement, Representation Learning	Unverified
Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion	Oct 25, 2022	Attribute, Voice Conversion	Unverified
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse	Oct 25, 2022	Voice Conversion	Unverified
Robust One-Shot Singing Voice Conversion	Oct 20, 2022	Voice Conversion	Unverified
DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion	Oct 20, 2022	Voice Conversion	Unverified
Boosting Star-GANs for Voice Conversion with Contrastive Discriminator	Sep 21, 2022	Contrastive Learning, Voice Conversion	Unverified
Non-Parallel Voice Conversion for ASR Augmentation	Sep 15, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset	Sep 14, 2022	text-to-speech, Text to Speech	Unverified
Investigation into Target Speaking Rate Adaptation for Voice Conversion	Sep 5, 2022	Disentanglement, Representation Learning	Unverified
Are disentangled representations all you need to build speaker anonymization systems?	Aug 22, 2022	All, Automatic Speech Recognition	Unverified
Differentiable WORLD Synthesizer-based Neural Vocoder With Application To End-To-End Audio Style Transfer	Aug 15, 2022	Style Transfer, Voice Conversion	Unverified
TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training	Aug 8, 2022	Voice Conversion	Unverified
Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation	Jul 29, 2022	Data Augmentation, text-to-speech	Unverified
Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis	Jul 25, 2022	Data Augmentation, Speech Synthesis	Unverified
GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion	Jul 4, 2022	Voice Conversion	Unverified
A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion	Jun 28, 2022	Speaker Recognition, Voice Conversion	Unverified
Comparison of Speech Representations for the MOS Prediction System	Jun 28, 2022	Self-Supervised Learning, text-to-speech	Unverified
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems	Jun 18, 2022	Speaker Identification, Speaker Verification	Unverified
End-to-End Voice Conversion with Information Perturbation	Jun 15, 2022	Voice Conversion	Unverified
Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos	Jun 9, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Investigating Inter- and Intra-speaker Voice Conversion using Audiobooks	Jun 1, 2022	Speech Synthesis, text-to-speech	Unverified
Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts	May 10, 2022	Speech Synthesis, Voice Conversion	Code Available
Attentive activation function for improving end-to-end spoofing countermeasure systems	May 3, 2022	Speech Synthesis, Voice Conversion	Unverified
Multi-task learning improves synthetic speech detection	Apr 27, 2022	Multi-Task Learning, Speaker Verification	Code Available
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation	Apr 21, 2022	Data Augmentation, text-to-speech	Unverified
Audio Deep Fake Detection System with Neural Stitching for ADD 2022	Apr 19, 2022	text-to-speech, Text to Speech	Unverified
Time Domain Adversarial Voice Conversion for ADD 2022	Apr 19, 2022	Voice Conversion	Unverified
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance	Apr 11, 2022	Speaker Verification, Speech Synthesis	Unverified
Representation Selective Self-distillation and wav2vec 2.0 Feature Exploration for Spoof-aware Speaker Verification	Apr 6, 2022	Attribute, Speaker Verification	Unverified
Disentangled Speech Representation Learning Based on Factorized Hierarchical Variational Autoencoder with Self-Supervised Objective	Apr 5, 2022	Disentanglement, Representation Learning	Unverified
Self-Supervised Speech Representations Preserve Speech Characteristics while Anonymizing Voices	Apr 4, 2022	Speaker Verification, speech-recognition	Unverified
Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck	Apr 4, 2022	Speaker Verification, text-to-speech	Unverified
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis	Apr 1, 2022	Speech Synthesis, Voice Conversion	Code Available
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis	Mar 31, 2022	Speech Synthesis, text-to-speech	Unverified
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE	Mar 30, 2022	Decoder, Sentence	Unverified
An Overview & Analysis of Sequence-to-Sequence Emotional Voice Conversion	Mar 29, 2022	Rhythm, Voice Conversion	Unverified
Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE	Mar 28, 2022	Speech Synthesis, Voice Conversion	Unverified
A Speech Representation Anonymization Framework via Selective Noise Perturbation	Mar 26, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
Disentangleing Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion	Mar 24, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Separating Content from Speaker Identity in Speech for the Assessment of Cognitive Impairments	Mar 21, 2022	Speaker Verification, Voice Conversion	Unverified
Improve few-shot voice cloning using multi-modal learning	Mar 18, 2022	text-to-speech, Text to Speech	Unverified
Text-free non-parallel many-to-many voice conversion using normalising flows	Mar 15, 2022	Normalising Flows, Speech Synthesis	Unverified
VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion	Feb 18, 2022	Quantization, Speech Synthesis	Unverified
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module	Feb 16, 2022	Speech Synthesis, text-to-speech	Unverified
Partially Fake Audio Detection by Self-attention-based Fake Span Discovery	Feb 14, 2022	Open-Ended Question Answering, Question Answering	Unverified

Datasets: ZeroSpeech 2019 English, LibriSpeech test-clean, VCTK

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	VQ-CPC	Speaker Similarity	3.8	—	Unverified
2	VQ-VAE	Speaker Similarity	3.49	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	kNN-VC (prematched HiFiGAN)	Character Error Rate (CER)	2.96	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DISSC	Total Length Error (TLE)	0.83	—	Unverified