| Pretraining Techniques for Sequence-to-Sequence Voice Conversion | Aug 7, 2020 | Automatic Speech Recognition (ASR) | Code Available | 1 |
| Multi-speaker Text-to-speech Synthesis Using Deep Gaussian Processes | Aug 7, 2020 | Gaussian Processes, Speech Synthesis | Unverified | 0 |
| Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning | Aug 7, 2020 | Audio Generation, reinforcement-learning | Unverified | 0 |
| Phonological Features for 0-shot Multilingual Speech Synthesis | Aug 6, 2020 | Speech Synthesis, text-to-speech | Code Available | 1 |
| One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech | Aug 3, 2020 | Meta-Learning, Speech Synthesis | Code Available | 1 |
| Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability | Jul 30, 2020 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| A Transfer Learning End-to-End Arabic Text-To-Speech (TTS) Deep Architecture | Jul 22, 2020 | Rhythm, Speech Synthesis | Unverified | 0 |
| Normalizing Text using Language Modelling based on Phonetics and String Similarity | Jun 25, 2020 | Language Modelling | Unverified | 0 |
| Generic Indic Text-to-speech Synthesisers with Rapid Adaptation in an End-to-end Framework | Jun 12, 2020 | Text to Speech | Unverified | 0 |
| FastPitch: Parallel Text-to-speech with Pitch Prediction | Jun 11, 2020 | Prediction, text-to-speech | Code Available | 1 |
| FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | Jun 8, 2020 | Knowledge Distillation, Speech Synthesis | Code Available | 1 |
| MultiSpeech: Multi-Speaker Text to Speech with Transformer | Jun 8, 2020 | Decoder, text-to-speech | Code Available | 1 |
| Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning | Jun 5, 2020 | Self-Supervised Learning, Speaker Verification | Code Available | 0 |
| End-to-End Adversarial Text-to-Speech | Jun 5, 2020 | Adversarial Text, Dynamic Time Warping | Code Available | 1 |
| Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search | May 22, 2020 | Text to Speech | Code Available | 1 |
| NAUTILUS: a Versatile Voice Cloning System | May 22, 2020 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Cross-lingual Multispeaker Text-to-Speech under Limited-Data Scenario | May 21, 2020 | Attribute, Speech Synthesis | Unverified | 0 |
| Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis | May 20, 2020 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech | May 19, 2020 | Text to Speech | Unverified | 0 |
| Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders | May 18, 2020 | Text to Speech | Unverified | 0 |
| Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation | May 16, 2020 | Decoder, Speech Synthesis | Unverified | 0 |
| JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment | May 15, 2020 | Text to Speech | Unverified | 0 |
| You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation | May 14, 2020 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | May 12, 2020 | Speech Synthesis, Style Transfer | Code Available | 1 |
| DiscreTalk: Text-to-Speech as a Machine Translation Problem | May 12, 2020 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN | May 12, 2020 | Few-Shot Learning, text-to-speech | Unverified | 0 |
| Exploring TTS without T Using Biologically/Psychologically Motivated Neural Network Modules (ZeroSpeech 2020) | May 11, 2020 | Clustering, speech-recognition | Code Available | 0 |
| Luganda Text-to-Speech Machine | May 11, 2020 | Text to Speech | Code Available | 0 |
| From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint | May 10, 2020 | Speaker Verification, Speech Synthesis | Code Available | 1 |
| Crowdsourcing Latin American Spanish for Low-Resource Text-to-Speech | May 1, 2020 | Text to Speech | Unverified | 0 |
| Open-source Multi-speaker Speech Corpora for Building Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu Speech Synthesis Systems | May 1, 2020 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Burmese Speech Corpus, Finite-State Text Normalization and Pronunciation Grammars with an Application to Text-to-Speech | May 1, 2020 | Text Normalization, text-to-speech | Unverified | 0 |
| IndicSpeech: Text-to-Speech Corpus for Indian Languages | May 1, 2020 | Text to Speech | Unverified | 0 |
| Corpus Generation for Voice Command in Smart Home and the Effect of Speech Synthesis on End-to-End SLU | May 1, 2020 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Development and Evaluation of Speech Synthesis Corpora for Latvian | May 1, 2020 | Speech Recognition | Unverified | 0 |
| Neural Text-to-Speech Synthesis for an Under-Resourced Language in a Diglossic Environment: the Case of Gascon Occitan | May 1, 2020 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Open-Source High Quality Speech Datasets for Basque, Catalan and Galician | May 1, 2020 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Style Variation as a Vantage Point for Code-Switching | May 1, 2020 | Language Modelling | Unverified | 0 |
| CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech | Apr 30, 2020 | Rhythm, text-to-speech | Unverified | 0 |
| A Study of Non-autoregressive Model for Sequence Generation | Apr 22, 2020 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| ESPnet-ST: All-in-One Speech Translation Toolkit | Apr 21, 2020 | All, Automatic Speech Recognition | Unverified | 0 |
| Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System | Apr 20, 2020 | Text to Speech | Unverified | 0 |
| Transformer based Grapheme-to-Phoneme Conversion | Apr 14, 2020 | Automatic Speech Recognition (ASR) | Code Available | 1 |
| Scalable Multilingual Frontend for TTS | Apr 10, 2020 | Chunking, Machine Translation | Unverified | 0 |
| Generating Multilingual Voices Using Speaker Space Translation Based on Bilingual Speaker Data | Apr 10, 2020 | Text to Speech | Unverified | 0 |
| Improving Readability for Automatic Speech Recognition Transcription | Apr 9, 2020 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| g2pM: A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset | Apr 7, 2020 | Grapheme-to-Phoneme Conversion, Polyphone disambiguation | Code Available | 1 |
| Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0 | Mar 14, 2020 | Clustering, Representation Learning | Code Available | 1 |
| Statistical Context-Dependent Units Boundary Correction for Corpus-based Unit-Selection Text-to-Speech | Mar 5, 2020 | Segmentation, text-to-speech | Unverified | 0 |
| AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment | Mar 4, 2020 | Text to Speech | Code Available | 0 |