SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs Jul 18, 2023 Generative Adversarial Network Language Modeling
— Unverified 0High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units Jun 29, 2023 Speech Synthesis text-to-speech
— Unverified 0Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Jun 23, 2023 In-Context Learning Speech Synthesis
Code Code Available 0ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models May 23, 2023 Speech Synthesis text-to-speech
— Unverified 0VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages May 21, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
— Unverified 0MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting May 19, 2023 Speech Synthesis text-to-speech
— Unverified 0A unified front-end framework for English text-to-speech synthesis May 18, 2023 Speech Synthesis Text Normalization
— Unverified 0Accented Text-to-Speech Synthesis with Limited Data May 8, 2023 Speech Synthesis text-to-speech
— Unverified 0M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis May 3, 2023 Speech Synthesis text-to-speech
— Unverified 0A Review of Deep Learning Techniques for Speech Processing Apr 30, 2023 Automatic Speech Recognition Deep Learning
— Unverified 0Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model Apr 24, 2023 Rhythm Self-Supervised Learning
— Unverified 0Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis Mar 27, 2023 All Automatic Speech Recognition
— Unverified 0A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI Mar 23, 2023 Speech Enhancement Speech Synthesis
— Unverified 0Controllable Prosody Generation With Partial Inputs Mar 14, 2023 Speech Synthesis text-to-speech
— Unverified 0Do Prosody Transfer Models Transfer Prosody? Mar 7, 2023 Speech Synthesis text-to-speech
— Unverified 0ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations Mar 1, 2023 Self-Supervised Learning Speech Synthesis
— Unverified 0UzbekTagger: The rule-based POS tagger for Uzbek language Jan 30, 2023 Language Modeling Language Modelling
— Unverified 0Applying Automated Machine Translation to Educational Video Courses Jan 9, 2023 Machine Translation Speech Synthesis
— Unverified 0ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration Jan 1, 2023 Audio-Visual Speech Recognition Resynthesis
— Unverified 0ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement Dec 21, 2022 Audio-Visual Speech Recognition Resynthesis
— Unverified 0Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder Dec 16, 2022 Representation Learning Speech Synthesis
— Unverified 0Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language Dec 16, 2022 Language Modeling Language Modelling
— Unverified 0Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models Nov 17, 2022 Speech Synthesis text-to-speech
— Unverified 0Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages Nov 1, 2022 Chunking Rhythm
— Unverified 0Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech Oct 27, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
— Unverified 0An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era Oct 6, 2022 Speech Synthesis text-to-speech
— Unverified 0Controllable Accented Text-to-Speech Synthesis Sep 22, 2022 Speech Synthesis text-to-speech
— Unverified 0EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models Sep 22, 2022 Speech Synthesis text-to-speech
— Unverified 0Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers Sep 5, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Code Code Available 0BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model Jul 4, 2022 Language Modeling Language Modelling
— Unverified 0R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS Jun 30, 2022 Decoder GPU
— Unverified 0Exploring Transfer Learning for Urdu Speech Synthesis Jun 1, 2022 Speech Synthesis text-to-speech
— Unverified 0BU-TTS: An Open-Source, Bilingual Welsh-English, Text-to-Speech Corpus Jun 1, 2022 Speech Synthesis text-to-speech
— Unverified 0Investigating Inter- and Intra-speaker Voice Conversion using Audiobooks Jun 1, 2022 Speech Synthesis text-to-speech
— Unverified 0Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish May 31, 2022 Machine Translation Speech Synthesis
Code Code Available 0ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence May 9, 2022 Speech Synthesis text-to-speech
— Unverified 0Systematic Inequalities in Language Technology Performance across the World’s Languages May 1, 2022 Dependency Parsing Machine Translation
Code Code Available 0The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance Apr 11, 2022 Speaker Verification Speech Synthesis
— Unverified 0SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis Apr 6, 2022 Speech Synthesis text-to-speech
— Unverified 0VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature Apr 2, 2022 Speech Synthesis text-to-speech
— Unverified 0Applying Syntax–Prosody Mapping Hypothesis and Prosodic Well-Formedness Constraints to Neural Sequence-to-Sequence Speech Synthesis Mar 29, 2022 Speech Synthesis text-to-speech
— Unverified 0AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling Mar 21, 2022 Decoder Speech Synthesis
— Unverified 0ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis Mar 20, 2022 Speaker Verification Speech Synthesis
Code Code Available 0Text-free non-parallel many-to-many voice conversion using normalising flows Mar 15, 2022 Normalising Flows Speech Synthesis
— Unverified 0Deep Performer: Score-to-Audio Music Performance Synthesis Feb 12, 2022 Decoder Speech Synthesis
— Unverified 0Multi-Stage Deep Transfer Learning for EmIoT-enabled Human-Computer Interaction Feb 3, 2022 Human-Object Interaction Detection text-to-speech
— Unverified 0Transformer-based Models of Text Normalization for Speech Applications Feb 1, 2022 Sentence Speech Synthesis
— Unverified 0Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios Dec 23, 2021 Diversity Speech Synthesis
— Unverified 0Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance Nov 23, 2021 speech-recognition Speech Recognition
— Unverified 0Systematic Inequalities in Language Technology Performance across the World's Languages Oct 13, 2021 Dependency Parsing Machine Translation
Code Code Available 0