SOTAVerified

Voice Conversion


Source: Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

Papers

Showing 151-200 of 520 papers

| Title | Status | Hype |
|---|---|---|
| AC-VC: Non-parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion | | 0 |
| Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion | | 0 |
| CTEFM-VC: Zero-Shot Voice Conversion Based on Content-Aware Timbre Ensemble Modeling and Flow Matching | | 0 |
| ALO-VC: Any-to-any Low-latency One-shot Voice Conversion | | 0 |
| Automatic Voice Identification after Speech Resynthesis using PPG | | 0 |
| Cross-speaker style transfer for text-to-speech using data augmentation | | 0 |
| Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation | | 0 |
| Crossmodal Voice Conversion | | 0 |
| A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment | | 0 |
| AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment | | 0 |
| High-quality nonparallel voice conversion based on cycle-consistent adversarial network | | 0 |
| Cross-modal Face- and Voice-style Transfer | | 0 |
| Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation | | 0 |
| Cross-lingual Knowledge Distillation via Flow-based Voice Conversion for Robust Polyglot Text-To-Speech | | 0 |
| Creating Personalized Synthetic Voices from Post-Glossectomy Speech with Guided Diffusion Models | | 0 |
| ArVoice: A Multi-Speaker Dataset for Arabic Speech Synthesis | | 0 |
| A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion | | 0 |
| A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech | | 0 |
| Generating and Detecting Various Types of Fake Image and Audio Content: A Review of Modern Deep Learning Technologies and Tools | | 0 |
| AE-Flow: AutoEncoder Normalizing Flow | | 0 |
| Learning Speech Representation From Contrastive Token-Acoustic Pretraining | | 0 |
| Generalizable Audio Deepfake Detection via Latent Space Refinement and Augmentation | | 0 |
| CO-VADA: A Confidence-Oriented Voice Augmentation Debiasing Approach for Fair Speech Emotion Recognition | | 0 |
| High Fidelity Speech Regeneration with Application to Speech Enhancement | | 0 |
| HLTCOE JHU Submission to the Voice Privacy Challenge 2024 | | 0 |
| Improved disentangled speech representations using contrastive learning in factorized hierarchical variational autoencoder | | 0 |
| ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion | | 0 |
| Are disentangled representations all you need to build speaker anonymization systems? | | 0 |
| FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation | | 0 |
| FastVC: Fast Voice Conversion with non-parallel data | | 0 |
| Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model | | 0 |
| Adversarial Transformation of Spoofing Attacks for Voice Biometrics | | 0 |
| Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion | | 0 |
| FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning | | 0 |
| Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos | | 0 |
| SongBsAb: A Dual Prevention Approach against Singing Voice Conversion based Illegal Song Covers | | 0 |
| FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks | | 0 |
| Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment | | 0 |
| Conditional Deep Hierarchical Variational Autoencoder for Voice Conversion | | 0 |
| EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion | | 0 |
| Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer | | 0 |
| Comparison of Speech Representations for the MOS Prediction System | | 0 |
| Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations | | 0 |
| Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion | | 0 |
| A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion | | 0 |
| Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements | | 0 |
| Adversarial speech for voice privacy protection from Personalized Speech generation | | 0 |
| Creating New Voices using Normalizing Flows | | 0 |
| GenVC: Self-Supervised Zero-Shot Voice Conversion | | 0 |
| Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features | | 0 |
Page 4 of 11

Benchmark Results

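| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VQ-CPC | Speaker Similarity | 3.8 | | Unverified |
| 2 | VQ-VAE | Speaker Similarity | 3.49 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | kNN-VC (prematched HiFiGAN) | Character Error Rate (CER) | 2.96 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DISSC | Total Length Error (TLE) | 0.83 | | Unverified |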