| Speech denoising by parametric resynthesis | Apr 2, 2019 | Denoising, Resynthesis | Unverified | 0 |
| Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora | Apr 1, 2019 | text-to-speech, Text to Speech | Unverified | 0 |
| ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks | Apr 1, 2019 | Feature Engineering, text-to-speech | Code Available | 0 |
| Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet | Mar 29, 2019 | Decoder, Speech Synthesis | Unverified | 0 |
| CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages | Mar 27, 2019 | text-to-speech, Text to Speech | Code Available | 0 |
| Generative adversarial network-based glottal waveform model for statistical parametric speech synthesis | Mar 14, 2019 | Generative Adversarial Network, Speech Synthesis | Unverified | 0 |
| Deep Text-to-Speech System with Seq2Seq Model | Mar 11, 2019 | model, Speech Synthesis | Unverified | 0 |
| Data Efficient Voice Cloning for Neural Singing Synthesis | Feb 19, 2019 | text-to-speech, Text to Speech | Unverified | 0 |
| Unsupervised Polyglot Text To Speech | Feb 6, 2019 | text-to-speech, Text to Speech | Unverified | 0 |
| Hand Sign to Bangla Speech: A Deep Learning in Vision based system for Recognizing Hand Sign Digits and Generating Bangla Speech | Jan 17, 2019 | Gesture Recognition, text-to-speech | Unverified | 0 |
| Feature reinforcement with word embedding and parsing information in neural TTS | Jan 3, 2019 | Sentence, text-to-speech | Unverified | 0 |
| FPETS : Fully Parallel End-to-End Text-to-Speech System | Dec 12, 2018 | text-to-speech, Text to Speech | Code Available | 0 |
| Generative Adversarial Network based Speaker Adaptation for High Fidelity WaveNet Vocoder | Dec 6, 2018 | Generative Adversarial Network, text-to-speech | Unverified | 0 |
| AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms | Nov 9, 2018 | GPU, Image Captioning | Unverified | 0 |
| Speaker-adaptive neural vocoders for parametric speech synthesis systems | Nov 8, 2018 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation | Nov 5, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Cycle-consistency training for end-to-end speech recognition | Nov 2, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator | Oct 31, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks | Oct 30, 2018 | Image Generation, Speech Synthesis | Unverified | 0 |
| Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization | Oct 30, 2018 | Data Augmentation, Disentanglement | Unverified | 0 |
| Neural source-filter-based waveform model for statistical parametric speech synthesis | Oct 29, 2018 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention | Oct 29, 2018 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language | Oct 29, 2018 | Speech Synthesis, text-to-speech | Code Available | 0 |
| A Deep Generative Acoustic Model for Compositional Automatic Speech Recognition | Oct 23, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| A Fully Time-domain Neural Model for Subband-based Speech Synthesizer | Oct 22, 2018 | text-to-speech, Text to Speech | Unverified | 0 |
| Hierarchical Generative Modeling for Controllable Speech Synthesis | Oct 16, 2018 | Attribute, Speech Synthesis | Code Available | 0 |
| Diacritization of Maghrebi Arabic Sub-Dialects | Oct 15, 2018 | text-to-speech, Text to Speech | Unverified | 0 |
| A Fully Time-domain Neural Model for Subband-based Speech Synthesizer | Oct 12, 2018 | text-to-speech, Text to Speech | Code Available | 0 |
| 台語古詩朗誦系統 (A Taiwanese Text-to-Speech System for Ancient Poems) [In Chinese] | Oct 1, 2018 | text-to-speech, Text to Speech | Unverified | 0 |
| A Challenge Set and Methods for Noun-Verb Ambiguity | Oct 1, 2018 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Sample Efficient Adaptive Text-to-Speech | Sep 27, 2018 | Meta-Learning, text-to-speech | Unverified | 0 |
| Self-Attention Linguistic-Acoustic Decoder | Aug 31, 2018 | CPU, Decoder | Unverified | 0 |
| Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis | Aug 30, 2018 | Decoder, Speech Synthesis | Unverified | 0 |
| Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis | Aug 4, 2018 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder | Jul 31, 2018 | Generative Adversarial Network, Speech Synthesis | Unverified | 0 |
| Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model | Jul 22, 2018 | Decoder, Sequence-To-Sequence Speech Recognition | Unverified | 0 |
| Low-Resource Machine Transliteration Using Recurrent Neural Networks of Asian Languages | Jul 1, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis | Jun 12, 2018 | Speaker Verification, Speech Synthesis | Code Available | 0 |
| Voice Imitating Text-to-Speech Neural Networks | Jun 4, 2018 | Sentence, text-to-speech | Unverified | 0 |
| Voice Builder: A Tool for Building Text-To-Speech Voices | May 1, 2018 | text-to-speech, Text to Speech | Unverified | 0 |
| Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech | May 1, 2018 | Automatic Speech Recognition (ASR), Speech Recognition | Unverified | 0 |
| Speaker-independent raw waveform model for glottal excitation | Apr 25, 2018 | model, Speech Synthesis | Unverified | 0 |
| Machine Speech Chain with One-shot Speaker Adaptation | Mar 28, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Speech to text and text to speech recognition systems - A review | Mar 17, 2018 | speech-recognition, Speech Recognition | Unverified | 0 |
| Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data | Mar 2, 2018 | Generative Adversarial Network, Speech Enhancement | Unverified | 0 |
| Deep Feed-forward Sequential Memory Networks for Speech Synthesis | Feb 26, 2018 | speech-recognition, Speech Recognition | Unverified | 0 |
| Fitting New Speakers Based on a Short Untranscribed Sample | Feb 20, 2018 | Speech Synthesis, text-to-speech | Unverified | 0 |
| Tools and resources for Romanian text-to-speech and speech-to-text applications | Feb 15, 2018 | speech-recognition, Speech Recognition | Code Available | 0 |
| An Implementation of Back-Propagation Learning on GF11, a Large SIMD Parallel Computer | Jan 4, 2018 | Neural Network simulation, text-to-speech | Unverified | 0 |
| HybridNet: A Hybrid Neural Architecture to Speed-up Autoregressive Models | Jan 1, 2018 | Speech Synthesis, text-to-speech | Unverified | 0 |