| Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding | Jun 27, 2022 | Few-Shot Learning, text-to-speech | Unverified | 0 |
| Synthesizing Personalized Non-speech Vocalization from Discrete Speech Representations | Jun 25, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue | Jun 24, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech | Jun 24, 2022 | Rhythm, text-to-speech | Unverified | 0 |
| Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech | Jun 24, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| A Simple Baseline for Domain Adaptation in End to End ASR Systems Using Synthetic Data | Jun 22, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS | Jun 21, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| Towards Optimizing OCR for Accessibility | Jun 21, 2022 | Optical Character Recognition (OCR), text-to-speech | Unverified | 0 |
| Automatic Prosody Annotation with Pre-Trained Text-Speech Model | Jun 16, 2022 | Speech Synthesis, text-to-speech | Code Available | 1 |
| NatiQ: An End-to-end Text-to-Speech System for Arabic | Jun 15, 2022 | Decoder, text-to-speech | Unverified | 0 |
| Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning | Jun 15, 2022 | Attribute, Emotion Classification | Code Available | 1 |
| A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation | Jun 10, 2022 | Machine Translation, text-to-speech | Unverified | 0 |
| Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos | Jun 9, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| FlexLip: A Controllable Text-to-Lip System | Jun 7, 2022 | Audio Generation, text-to-speech | Unverified | 0 |
| Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE | Jun 6, 2022 | Representation Learning, Speech Representation Learning | Unverified | 0 |
| Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech | Jun 5, 2022 | Polyphone disambiguation, text-to-speech | Code Available | 1 |
| BU-TTS: An Open-Source, Bilingual Welsh-English, Text-to-Speech Corpus | Jun 1, 2022 | Speech Synthesis, text-to-speech | Unverified | 0 |
| The Nós Project: Opening routes for the Galician language in the field of language technologies | Jun 1, 2022 | Cultural Vocal Bursts Intensity Prediction, Machine Translation | Unverified | 0 |
| Reading Assistance through LARA, the Learning And Reading Assistant | Jun 1, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| Exploring Transfer Learning for Urdu Speech Synthesis | Jun 1, 2022 | Speech Synthesis, text-to-speech | Unverified | 0 |
| An Open Source Web Reader for Under-Resourced Languages | Jun 1, 2022 | text-to-speech, Text to Speech | Code Available | 0 |
| Huqariq: A Multilingual Speech Corpus of Native Languages of Peru for Speech Recognition | Jun 1, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Text-to-Speech for Under-Resourced Languages: Phoneme Mapping and Source Language Selection in Transfer Learning | Jun 1, 2022 | Cross-Lingual Transfer, text-to-speech | Unverified | 0 |
| Building Open-source Speech Technology for Low-resource Minority Languages with Sámi as an Example – Tools, Methods and Experiments | Jun 1, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Investigating Inter- and Intra-speaker Voice Conversion using Audiobooks | Jun 1, 2022 | Speech Synthesis, text-to-speech | Unverified | 0 |
| ParlamentParla: A Speech Corpus of Catalan Parliamentary Sessions | Jun 1, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Audiobook Dialogues as Training Data for Conversational Style Synthetic Voices | Jun 1, 2022 | Sentence, text-to-speech | Unverified | 0 |
| Using the LARA Little Prince to compare human and TTS audio quality | Jun 1, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| Error Annotation in Post-Editing Machine Translation: Investigating the Impact of Text-to-Speech Technology | Jun 1, 2022 | Machine Translation, text-to-speech | Unverified | 0 |
| Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish | May 31, 2022 | Machine Translation, Speech Synthesis | Code Available | 0 |
| StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis | May 30, 2022 | Data Augmentation, Self-Supervised Learning | Code Available | 2 |
| Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data | May 30, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| Exploiting Transliterated Words for Finding Similarity in Inter-Language News Articles using Machine Learning | May 29, 2022 | Articles, Machine Translation | Unverified | 0 |
| QSpeech: Low-Qubit Quantum Speech Application Toolkit | May 26, 2022 | text-to-speech, Text to Speech | Code Available | 0 |
| T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation | May 24, 2022 | Decoder, Machine Translation | Unverified | 0 |
| PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All, Automatic Speech Recognition (ASR) | Code Available | 6 |
| GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech | May 15, 2022 | Speech Synthesis, Style Transfer | Code Available | 2 |
| Talking Face Generation with Multilingual TTS | May 13, 2022 | Face Generation, Talking Face Generation | Unverified | 0 |
| NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | May 9, 2022 | Sentence, Speech Synthesis | Code Available | 2 |
| Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech | May 9, 2022 | Diversity, text-to-speech | Code Available | 1 |
| ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence | May 9, 2022 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Systematic Inequalities in Language Technology Performance across the World’s Languages | May 1, 2022 | Dependency Parsing, Machine Translation | Code Available | 0 |
| Pretrained Speech Encoders and Efficient Fine-tuning Methods for Speech Translation: UPC at IWSLT 2022 | May 1, 2022 | Decoder, Knowledge Distillation | Code Available | 0 |
| Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss | Apr 28, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| LibriS2S: A German-English Speech-to-Speech Translation Corpus | Apr 22, 2022 | Speech-to-Speech Translation, Speech-to-Text | Code Available | 0 |
| FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | Apr 21, 2022 | Denoising, GPU | Code Available | 2 |
| Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation | Apr 21, 2022 | Data Augmentation, text-to-speech | Unverified | 0 |
| Audio Deep Fake Detection System with Neural Stitching for ADD 2022 | Apr 19, 2022 | text-to-speech, Text to Speech | Unverified | 0 |
| Applying Feature Underspecified Lexicon Phonological Features in Multilingual Text-to-Speech | Apr 14, 2022 | Language Acquisition, text-to-speech | Unverified | 0 |
| Study of Indian English Pronunciation Variabilities relative to Received Pronunciation | Apr 13, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |