SOTAVerified

Voice Conversion

Source: Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

Papers

Showing 101-150 of 520 papers

Title | Status | Hype
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion | Code | 1
HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods | Code | 1
Hiding speaker's sex in speech using zero-evidence speaker representation in an analysis/synthesis pipeline | Code | 1
Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders | Code | 1
Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques | Code | 1
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion | Code | 1
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training | Code | 1
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion | Code | 1
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection | Code | 1
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone | Code | 1
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion | Code | 1
CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer | Code | 1
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform | Code | 1
MOSNet: Deep Learning based Objective Assessment for Voice Conversion | Code | 1
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection | Code | 1
FSD: An Initial Chinese Dataset for Fake Song Detection | Code | 1
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder | Code | 1
Deep Learning Based Assessment of Synthetic Speech Naturalness | Code | 1
Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion | Code | 1
FastSVC: Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation | Code | 1
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models | Code | 1
Low-Latency Real-Time Non-Parallel Voice Conversion based on Cyclic Variational Autoencoder and Multiband WaveRNN with Data-Driven Linear Prediction | Code | 1
Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph | Code | 1
Defending Your Voice: Adversarial Attack on Voice Conversion | Code | 1
Toward Degradation-Robust Voice Conversion | Code | 1
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints | - | 0
A Unified Model For Voice and Accent Conversion In Speech and Singing using Self-Supervised Learning and Feature Extraction | - | 0
Adaptive Speech Duration Modification using a Deep-Generative Framework | - | 0
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices | - | 0
Audio Deep Fake Detection System with Neural Stitching for ADD 2022 | - | 0
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR | - | 0
Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling | - | 0
Deep Learning-based F0 Synthesis for Speaker Anonymization | - | 0
Audio Anti-spoofing Using a Simple Attention Module and Joint Optimization Based on Additive Angular Margin Loss and Meta-learning | - | 0
Many-to-Many Voice Conversion with Out-of-Dataset Speaker Support | - | 0
End-to-End Voice Conversion with Information Perturbation | - | 0
DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding | - | 0
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms | - | 0
Attentive activation function for improving end-to-end spoofing countermeasure systems | - | 0
Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE | - | 0
ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations | - | 0
D-CAPTCHA++: A Study of Resilience of Deepfake CAPTCHA under Transferable Imperceptible Adversarial Attack | - | 0
Data Augmentation for Diverse Voice Conversion in Noisy Environments | - | 0
Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion | - | 0
ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech | - | 0
An Adaptive Learning based Generative Adversarial Network for One-To-One Voice Conversion | - | 0
Emotion Intensity and its Control for Emotional Voice Conversion | - | 0
ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech | - | 0
CycleFlow: Purify Information Factors by Cycle Loss | - | 0
AC-VC: Non-parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion | - | 0
Page 3 of 11

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | VQ-CPC | Speaker Similarity | 3.8 | - | Unverified
2 | VQ-VAE | Speaker Similarity | 3.49 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | kNN-VC (prematched HiFiGAN) | Character Error Rate (CER) | 2.96 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DISSC | Total Length Error (TLE) | 0.83 | - | Unverified