Text-To-Speech Synthesis

Text-To-Speech Synthesis is a machine learning task that converts written text into spoken audio. The goal is to generate synthetic speech that sounds natural and resembles human speech as closely as possible.

Papers

Title	Date	Tasks	Status
Emphasized Accent Phrase Prediction from Text for Advertisement Text-To-Speech Synthesis	Dec 1, 2014	Speech Synthesis, text-to-speech	Unverified
Duration Modeling by Multi-Models based on Vowel Production characteristics	Dec 1, 2014	Speech Synthesis, Text-To-Speech Synthesis	Unverified
Hippocratic Abbreviation Expansion	Jun 1, 2014	Information Retrieval, Machine Translation	Unverified
RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus	May 1, 2014	Speech Synthesis, text-to-speech	Unverified
Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and A Network-based ASR System	May 1, 2014	Machine Translation, speech-recognition	Unverified
Using a machine learning model to assess the complexity of stress systems	May 1, 2014	BIG-bench Machine Learning, Speech Synthesis	Unverified
Designing Language Technology Applications: A Wizard of Oz Driven Prototyping Framework	Apr 1, 2014	Machine Translation, Speech Recognition	Unverified
Predicting Romanian Stress Assignment	Apr 1, 2014	Speech Synthesis, Text-To-Speech Synthesis	Unverified
HMM-based Mandarin Singing Voice Synthesis Using Tailored Synthesis Units and Question Sets	Dec 1, 2013	Singing Voice Synthesis, Speech Synthesis	Unverified
Development of Marathi Part of Speech Tagger Using Statistical Approach	Oct 2, 2013	Information Retrieval, Part-Of-Speech Tagging	Unverified
Fast Bootstrapping of Grapheme to Phoneme System for Under-resourced Languages - Application to the Iban Language	Oct 1, 2013	Speech Recognition, Speech Synthesis	Unverified
Russian Stress Prediction using Maximum Entropy Ranking	Oct 1, 2013	Machine Translation, Prediction	Unverified
Multi-step Natural Language Understanding	Aug 1, 2013	Natural Language Understanding, Speech Recognition	Unverified
WebWOZ: A Platform for Designing and Conducting Web-based Wizard of Oz Experiments	Aug 1, 2013	Machine Translation, Speech Recognition	Unverified
Adaptive Parser-Centric Text Normalization	Aug 1, 2013	Machine Translation, Speech Recognition	Unverified
Large tagset labeling using Feed Forward Neural Networks. Case study on Romanian Language	Aug 1, 2013	Machine Translation, Part-Of-Speech Tagging	Unverified
Punjabi Text-To-Speech Synthesis System	Dec 1, 2012	Speech Synthesis, text-to-speech	Unverified
Automatically Acquiring Fine-Grained Information Status Distinctions in German	Jul 1, 2012	Coreference Resolution, Speech Synthesis	Unverified
Vers une annotation automatique de corpus audio pour la synthèse de parole (Towards Fully Automatic Annotation of Audio Books for Text-To-Speech (TTS) Synthesis) [in French]	Jun 1, 2012	Speech Synthesis, text-to-speech	Unverified
Variations prosodiques en synthèse par sélection d'unités: l'exemple des phrases interrogatives (Prosodic variations in unit-based speech synthesis: the example of interrogative sentences) [in French]	Jun 1, 2012	Speech Synthesis, Text-To-Speech Synthesis	Unverified
Leveraging supplemental representations for sequential transduction	Jun 1, 2012	Speech Synthesis, Text-To-Speech Synthesis	Unverified
Real-time Incremental Speech-to-Speech Translation of Dialogs	Jun 1, 2012	Machine Translation, Speech Recognition	Unverified
Designing French Tale Corpora for Entertaining Text To Speech Synthesis	May 1, 2012	Sentence, Speech Synthesis	Unverified
LDC Forced Aligner	May 1, 2012	Sentence, Speech Recognition	Unverified
BUCEADOR, a multi-language search engine for digital libraries	May 1, 2012	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Building Text-to-Speech Systems for Resource Poor Languages	May 1, 2012	Clustering, Speech Synthesis	Unverified
Learning Sentiment Lexicons in Spanish	May 1, 2012	Opinion Mining, Question Answering	Unverified
Texto4Science: a Quebec French Database of Annotated Short Text Messages	May 1, 2012	Speech Synthesis, Text-To-Speech Synthesis	Unverified
Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis	May 1, 2012	Audio-Visual Speech Recognition, Speech Recognition	Unverified
Open-Source Boundary-Annotated Corpus for Arabic Speech and Language Processing	May 1, 2012	Chunking, Descriptive	Unverified
Towards Fully Automatic Annotation of Audio Books for TTS	May 1, 2012	Speech Recognition, Speech Synthesis	Unverified
BAD: An Assistant tool for making verses in Basque	Apr 1, 2012	Speech Synthesis, Text-To-Speech Synthesis	Unverified

Datasets: LJSpeech, 20000 utterances, CMUDict 0.7b, HUI speech corpus, Thorsten voice 21.02 neutral, Trinity Speech-Gesture Dataset

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	NaturalSpeech	Audio Quality MOS	4.56	—	Unverified
2	VITS	Audio Quality MOS	4.43	—	Unverified
3	Grad-TTS + HiFiGAN (1000 steps)	Audio Quality MOS	4.37	—	Unverified
4	FastSpeech 2 + HiFiGAN	Audio Quality MOS	4.34	—	Unverified
5	Glow-TTS + HiFiGAN	Audio Quality MOS	4.34	—	Unverified
6	FastSpeech 2 + HiFiGAN	Audio Quality MOS	4.32	—	Unverified
7	FastDiff (4 steps)	Audio Quality MOS	4.28	—	Unverified
8	FastDiff-TTS	Audio Quality MOS	4.03	—	Unverified
9	Transformer TTS (Mel + WaveGlow)	Audio Quality MOS	3.88	—	Unverified
10	FastSpeech (Mel + WaveGlow)	Audio Quality MOS	3.84	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Mia	10-keyword Speech Commands dataset	16	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Token-Level Ensemble Distillation	Phoneme Error Rate	4.6	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Tacotron 2	Mean Opinion Score	3.74	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Tacotron 2	Mean Opinion Score	3.49	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Match-TTSG	MOS	3.7	—	Unverified