Coding Speech through Vocal Tract Kinematics | Jun 18, 2024 | Voice Conversion | Code Available 2
Improving child speech recognition with augmented child-like speech | Jun 12, 2024 | speech-recognition Speech Recognition | Unverified 0
DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion | Jun 12, 2024 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified 0
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models | Jun 12, 2024 | Voice Conversion Voice Similarity | Unverified 0
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion | Jun 9, 2024 | SSIM Voice Conversion | Unverified 0
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance | Jun 8, 2024 | Voice Conversion | Unverified 0
The Database and Benchmark for the Source Speaker Tracing Challenge 2024 | Jun 7, 2024 | Multi-Task Learning Speaker Verification | Unverified 0
Towards Naturalistic Voice Conversion: NaturalVoices Dataset with an Automatic Processing Pipeline | Jun 6, 2024 | Voice Conversion | Unverified 0
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion | Jun 4, 2024 | In-Context Learning Language Modeling | Unverified 0
Real-Time and Accurate: Zero-shot High-Fidelity Singing Voice Conversion with Multi-Condition Flow Synthesis | May 23, 2024 | Attribute Decoder | Unverified 0
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model | May 2, 2024 | Denoising Emotion Recognition | Unverified 0
Who is Authentic Speaker | Apr 30, 2024 | Speaker Recognition Voice Conversion | Unverified 0
FlashSpeech: Efficient Zero-Shot Speech Synthesis | Apr 23, 2024 | Rhythm Speech Synthesis | Code Available 3
Retrieval-Augmented Audio Deepfake Detection | Apr 22, 2024 | Audio Deepfake Detection DeepFake Detection | Unverified 0
PSCodec: A Series of High-Fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders | Apr 3, 2024 | Representation Learning Speaker Verification | Unverified 0
Voice Conversion Augmentation for Speaker Recognition on Defective Datasets | Apr 1, 2024 | Speaker Recognition Voice Conversion | Unverified 0
PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion | Mar 3, 2024 | Voice Conversion | Unverified 0
Transcription and translation of videos using fine-tuned XLSR Wav2Vec2 on custom dataset and mBART | Mar 1, 2024 | Retrieval Translation | Unverified 0
High-Fidelity Neural Phonetic Posteriorgrams | Feb 27, 2024 | Voice Conversion | Code Available 2
Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations | Feb 5, 2024 | Decoder In-Context Learning | Unverified 0
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition | Jan 31, 2024 | Decoder Language Modeling | Unverified 0
SongBsAb: A Dual Prevention Approach against Singing Voice Conversion based Illegal Song Covers | Jan 30, 2024 | Voice Conversion | Unverified 0
SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation | Jan 24, 2024 | text-to-speech Text to Speech | Code Available 5
Adversarial speech for voice privacy protection from Personalized Speech generation | Jan 22, 2024 | Speaker Verification text-to-speech | Unverified 0
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion | Jan 19, 2024 | Language Modeling Language Modelling | Unverified 0
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment | Jan 16, 2024 | Disentanglement Self-Supervised Learning | Code Available 2
Transfer the linguistic representations from TTS to accent conversion with non-parallel data | Jan 7, 2024 | text-to-speech Text to Speech | Unverified 0
StreamVC: Real-Time Low-Latency Voice Conversion | Jan 5, 2024 | Speech Synthesis Voice Conversion | Unverified 0
CoMoSVC: Consistency Model-based Singing Voice Conversion | Jan 3, 2024 | GPU model | Code Available 2
Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion | Dec 29, 2023 | Contrastive Learning Disentanglement | Unverified 0
AE-Flow: AutoEncoder Normalizing Flow | Dec 27, 2023 | text-to-speech Text to Speech | Unverified 0
Exploring data augmentation in bias mitigation against non-native-accented speech | Dec 24, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified 0
Creating New Voices using Normalizing Flows | Dec 22, 2023 | Speech Synthesis text-to-speech | Unverified 0
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform | Dec 17, 2023 | Image Segmentation Segmentation | Code Available 1
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection | Dec 15, 2023 | Audio Deepfake Detection Continual Learning | Code Available 1
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention | Dec 14, 2023 | Position Voice Conversion | Unverified 0
PerMod: Perceptually Grounded Voice Modification with Latent Diffusion Models | Dec 13, 2023 | Sentence Voice Conversion | Unverified 0
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes | Nov 29, 2023 | Face Recognition Face Swapping | Unverified 0
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion | Nov 24, 2023 | Data Augmentation Retrieval | Unverified 0
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis | Nov 21, 2023 | Speech Synthesis Super-Resolution | Code Available 3
Improving fairness for spoken language understanding in atypical speech with Text-to-Speech | Nov 16, 2023 | Data Augmentation Fairness | Code Available 1
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion | Nov 14, 2023 | Deep Learning Diversity | Unverified 0
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion | Nov 13, 2023 | Contrastive Learning EEG | Code Available 1
Parrot-Trained Adversarial Examples: Pushing the Practicality of Black-Box Audio Attacks against Speaker Recognition Models | Nov 13, 2023 | Sentence Speaker Recognition | Unverified 0
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation | Nov 8, 2023 | Style Transfer Voice Conversion | Code Available 2
Non-Parallel Training Approach for Emotional Voice Conversion Using CycleGAN | Nov 1, 2023 | Voice Conversion | Code Available 0
Low-latency Real-time Voice Conversion on CPU | Nov 1, 2023 | CPU Knowledge Distillation | Code Available 2
An overview of text-to-speech systems and media applications | Oct 22, 2023 | Acoustic Modelling text-to-speech | Unverified 0
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations | Oct 14, 2023 | Self-Supervised Learning Speaker Verification | Unverified 0
Voice Conversion for Stuttered Speech, Instruments, Unseen Languages and Textually Described Voices | Oct 12, 2023 | Voice Conversion | Unverified 0