Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples (Aug 23, 2024). Data Augmentation; Meta-Learning. Unverified.
LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with Inference Acceleration via Latent Consistency Distillation (Aug 22, 2024). Voice Conversion. Unverified.
Hear Your Face: Face-based voice conversion with F0 estimation (Aug 19, 2024). Voice Conversion. Code available.
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing (Aug 11, 2024). Automatic Speech Recognition (ASR). Unverified.
MulliVC: Multi-lingual Voice Conversion With Cycle Consistency (Aug 8, 2024). Voice Conversion. Unverified.
Automatic Voice Identification after Speech Resynthesis using PPG (Aug 5, 2024). Resynthesis; Speaker Verification. Unverified.
StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion (Aug 5, 2024). Automatic Speech Recognition (ASR). Unverified.
Towards Realistic Emotional Voice Conversion using Controllable Emotional Intensity (Jul 20, 2024). Diversity; Rhythm. Unverified.
The VoicePrivacy 2022 Challenge: Progress and Perspectives in Voice Anonymisation (Jul 16, 2024). Automatic Speech Recognition; speech-recognition. Unverified.
Source Tracing of Audio Deepfake Systems (Jul 10, 2024). Face Swapping; text-to-speech. Unverified.
We Need Variations in Speech Generation: Sub-center Modelling for Speaker Embeddings (Jul 5, 2024). Speaker Recognition; Speech Synthesis. Unverified.
Application of ASV for Voice Identification after VC and Duration Predictor Improvement in TTS Models (Jun 27, 2024). Speaker Verification; text-to-speech. Unverified.
RefXVC: Cross-Lingual Voice Conversion with Enhanced Reference Leveraging (Jun 24, 2024). Sentence; Voice Conversion. Unverified.
DreamVoice: Text-Guided Voice Conversion (Jun 24, 2024). text-guided-generation; Voice Conversion. Unverified.
Improving child speech recognition with augmented child-like speech (Jun 12, 2024). Speech Recognition. Unverified.
DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion (Jun 12, 2024). Automatic Speech Recognition (ASR). Unverified.
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models (Jun 12, 2024). Voice Conversion; Voice Similarity. Unverified.
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion (Jun 9, 2024). SSIM; Voice Conversion. Unverified.
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance (Jun 8, 2024). Voice Conversion. Unverified.
The Database and Benchmark for the Source Speaker Tracing Challenge 2024 (Jun 7, 2024). Multi-Task Learning; Speaker Verification. Unverified.
Towards Naturalistic Voice Conversion: NaturalVoices Dataset with an Automatic Processing Pipeline (Jun 6, 2024). Voice Conversion. Unverified.
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion (Jun 4, 2024). In-Context Learning; Language Modeling. Unverified.
Real-Time and Accurate: Zero-shot High-Fidelity Singing Voice Conversion with Multi-Condition Flow Synthesis (May 23, 2024). Attribute; Decoder. Unverified.
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model (May 2, 2024). Denoising; Emotion Recognition. Unverified.
Who is Authentic Speaker (Apr 30, 2024). Speaker Recognition; Voice Conversion. Unverified.
Retrieval-Augmented Audio Deepfake Detection (Apr 22, 2024). Audio Deepfake Detection; DeepFake Detection. Unverified.
PSCodec: A Series of High-Fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders (Apr 3, 2024). Representation Learning; Speaker Verification. Unverified.
Voice Conversion Augmentation for Speaker Recognition on Defective Datasets (Apr 1, 2024). Speaker Recognition; Voice Conversion. Unverified.
PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion (Mar 3, 2024). Voice Conversion. Unverified.
Transcription and translation of videos using fine-tuned XLSR Wav2Vec2 on custom dataset and mBART (Mar 1, 2024). Retrieval; Translation. Unverified.
Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations (Feb 5, 2024). Decoder; In-Context Learning. Unverified.
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition (Jan 31, 2024). Decoder; Language Modeling. Unverified.
SongBsAb: A Dual Prevention Approach against Singing Voice Conversion based Illegal Song Covers (Jan 30, 2024). Voice Conversion. Unverified.
Adversarial speech for voice privacy protection from Personalized Speech generation (Jan 22, 2024). Speaker Verification; text-to-speech. Unverified.
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion (Jan 19, 2024). Language Modeling. Unverified.
Transfer the linguistic representations from TTS to accent conversion with non-parallel data (Jan 7, 2024). Text to Speech. Unverified.
StreamVC: Real-Time Low-Latency Voice Conversion (Jan 5, 2024). Speech Synthesis; Voice Conversion. Unverified.
Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion (Dec 29, 2023). Contrastive Learning; Disentanglement. Unverified.
AE-Flow: AutoEncoder Normalizing Flow (Dec 27, 2023). Text to Speech. Unverified.
Exploring data augmentation in bias mitigation against non-native-accented speech (Dec 24, 2023). Automatic Speech Recognition (ASR). Unverified.
Creating New Voices using Normalizing Flows (Dec 22, 2023). Speech Synthesis; text-to-speech. Unverified.
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention (Dec 14, 2023). Position; Voice Conversion. Unverified.
PerMod: Perceptually Grounded Voice Modification with Latent Diffusion Models (Dec 13, 2023). Sentence; Voice Conversion. Unverified.
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes (Nov 29, 2023). Face Recognition; Face Swapping. Unverified.
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion (Nov 24, 2023). Data Augmentation; Retrieval. Unverified.
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion (Nov 14, 2023). Deep Learning; Diversity. Unverified.
Parrot-Trained Adversarial Examples: Pushing the Practicality of Black-Box Audio Attacks against Speaker Recognition Models (Nov 13, 2023). Sentence; Speaker Recognition. Unverified.
Non-Parallel Training Approach for Emotional Voice Conversion Using CycleGAN (Nov 1, 2023). Voice Conversion. Code available.
An overview of text-to-speech systems and media applications (Oct 22, 2023). Acoustic Modelling; text-to-speech. Unverified.
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations (Oct 14, 2023). Self-Supervised Learning; Speaker Verification. Unverified.