Speech Synthesis

Speech synthesis is the task of generating speech from another modality, such as text or lip movements.

Please note that the leaderboards here are not directly comparable between studies, because they use mean opinion score (MOS) as a metric and collect different samples from Amazon Mechanical Turk.
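To make the comparability caveat concrete, here is a minimal Python sketch of how a mean opinion score and an approximate 95% confidence interval could be computed from listener ratings. The function name `mean_opinion_score` and both rating lists are hypothetical and are not taken from any paper listed on this page; the interval uses a plain normal approximation, which is an assumption rather than the procedure of any particular study.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mean, half-width of an approximate 95% confidence interval)."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(ratings)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(ratings) / math.sqrt(len(ratings))
    # 1.96 is the normal-approximation critical value for 95% coverage.
    return mean, 1.96 * sem


# Hypothetical 1-5 ratings of the same system from two different rater pools.
study_a = [5, 4, 5, 4, 4, 5, 3, 5, 4, 5]
study_b = [4, 3, 4, 4, 3, 5, 3, 4, 3, 4]

for name, ratings in (("study A", study_a), ("study B", study_b)):
    mos, ci = mean_opinion_score(ratings)
    print(f"{name}: MOS = {mos:.2f} +/- {ci:.2f}")
```

In this made-up example the same system scores differently in the two pools, which is why a gap between MOS values reported by different studies may reflect the raters and samples as much as the models.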

(Image credit: WaveNet: A Generative Model for Raw Audio)

Papers


Showing 1151–1200 of 1249 papers

Title	Date	Tasks	Status
An In-depth Analysis of the Effect of Text Normalization in Social Media	May 1, 2015	Dependency Parsing, named-entity-recognition	Unverified
Normalization of Non-Standard Words in Croatian Texts	Mar 27, 2015	Form, General Classification	Unverified
F0 Modeling In Hmm-Based Speech Synthesis System Using Deep Belief Network	Feb 18, 2015	Clustering, Speaker Verification	Unverified
Sequence Modeling using Gated Recurrent Neural Networks	Jan 1, 2015	Machine Translation, Speech Synthesis	Unverified
Duration Modeling by Multi-Models based on Vowel Production characteristics	Dec 1, 2014	Speech Synthesis, Text-To-Speech Synthesis	Unverified
Emphasized Accent Phrase Prediction from Text for Advertisement Text-To-Speech Synthesis	Dec 1, 2014	Speech Synthesis, text-to-speech	Unverified
基於發音知識以建構頻譜HMM 之國語語音合成方法 (A Mandarin Speech Synthesis Method Using Articulation-knowledge Based Spectral HMM Structure) [In Chinese]	Oct 1, 2014	Speech Synthesis	Unverified
A Deep Learning Approach to Data-driven Parameterizations for Statistical Parametric Speech Synthesis	Sep 30, 2014	Denoising, Speech Synthesis	Unverified
An Algorithm Based on Empirical Methods, for Text-to-Tuneful-Speech Synthesis of Sanskrit Verse	Sep 15, 2014	Speech Synthesis, text-to-speech	Unverified
Speech earthquakes: scaling and universality in human voice	Aug 5, 2014	Speech Synthesis	Unverified
Situated Incremental Natural Language Understanding using a Multimodal, Linguistically-driven Update Model	Aug 1, 2014	Dialogue Management, Natural Language Understanding	Unverified
Learning to Summarise Related Sentences	Aug 1, 2014	Question Answering, Sentence Compression	Unverified
A Bengali HMM Based Speech Synthesis System	Jun 16, 2014	Speech Synthesis, text-to-speech	Unverified
Hippocratic Abbreviation Expansion	Jun 1, 2014	Information Retrieval, Machine Translation	Unverified
MVA: The Multimodal Virtual Assistant	Jun 1, 2014	Speech Recognition, Speech Synthesis	Unverified
An Expert System for Automatic Reading of A Text Written in Standard Arabic	May 8, 2014	Speech Synthesis, text-to-speech	Unverified
RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus	May 1, 2014	Speech Synthesis, text-to-speech	Unverified
Casa de la Lhéngua: a set of language resources and natural language processing tools for Mirandese	May 1, 2014	POS, POS Tagging	Unverified
Designing the Latvian Speech Recognition Corpus	May 1, 2014	speech-recognition, Speech Recognition	Unverified
Alert!... Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis.	May 1, 2014	Expressive Speech Synthesis, Sentence	Unverified
The MMASCS multi-modal annotated synchronous corpus of audio, video, facial motion and tongue motion data of normal, fast and slow speech	May 1, 2014	Speech Synthesis	Unverified
Using a machine learning model to assess the complexity of stress systems	May 1, 2014	BIG-bench Machine Learning, Speech Synthesis	Unverified
GlobalPhone: Pronunciation Dictionaries in 20 Languages	May 1, 2014	Language Identification, Language Modelling	Unverified
Using Audio Books for Training a Text-to-Speech System	May 1, 2014	Diversity, Speech Synthesis	Unverified
HESITA(te) in Portuguese	May 1, 2014	Acoustic Modelling, Automatic Speech Recognition	Unverified
Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary	May 1, 2014	speech-recognition, Speech Recognition	Unverified
Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and A Network-based ASR System	May 1, 2014	Machine Translation, speech-recognition	Unverified
A Conventional Orthography for Tunisian Arabic	May 1, 2014	Language Modelling, Machine Translation	Unverified
The Development of the Multilingual LUNA Corpus for Spoken Language System Porting	May 1, 2014	Machine Translation, Speech Synthesis	Unverified
The AV-LASYN Database: A synchronous corpus of audio and 3D facial marker data for audio-visual laughter synthesis	May 1, 2014	Dimensionality Reduction, Speech Synthesis	Unverified
Predicting Romanian Stress Assignment	Apr 1, 2014	Speech Synthesis, Text-To-Speech Synthesis	Unverified
From Speaker Identification to Affective Analysis: A Multi-Step System for Analyzing Children's Stories	Apr 1, 2014	Age Estimation, Speaker Identification	Unverified
Designing Language Technology Applications: A Wizard of Oz Driven Prototyping Framework	Apr 1, 2014	Machine Translation, Speech Recognition	Unverified
Auto Spell Suggestion for High Quality Speech Synthesis in Hindi	Feb 15, 2014	Speech Synthesis, text-to-speech	Unverified
HMM-based Mandarin Singing Voice Synthesis Using Tailored Synthesis Units and Question Sets	Dec 1, 2013	Singing Voice Synthesis, Speech Synthesis	Unverified
Development of Marathi Part of Speech Tagger Using Statistical Approach	Oct 2, 2013	Information Retrieval, Part-Of-Speech Tagging	Unverified
Response Generation Based on Hierarchical Semantic Structure with POMDP Re-ranking for Conversational Dialogue Systems	Oct 1, 2013	Dialogue Management, Information Retrieval	Unverified
Russian Stress Prediction using Maximum Entropy Ranking	Oct 1, 2013	Machine Translation, Prediction	Unverified
Fast Bootstrapping of Grapheme to Phoneme System for Under-resourced Languages - Application to the Iban Language	Oct 1, 2013	Speech Recognition, Speech Synthesis	Unverified
A unified lexical processing framework based on the Margin Infused Relaxed Algorithm. A case study on the Romanian Language	Sep 1, 2013	Lemmatization, Speech Synthesis	Unverified
Open-ended, Extensible System Utterances Are Preferred, Even If They Require Filled Pauses	Aug 1, 2013	Speech Synthesis, Spoken Dialogue Systems	Unverified
Towards Personalised Synthesised Voices for Individuals with Vocal Disabilities: Voice Banking and Reconstruction	Aug 1, 2013	Speech Synthesis	Unverified
A Robotic Agent in a Virtual Environment that Performs Situated Incremental Understanding of Navigational Utterances	Aug 1, 2013	Language Modelling, Speech Recognition	Unverified
Multi-step Natural Language Understanding	Aug 1, 2013	Natural Language Understanding, Speech Recognition	Unverified
The dramatic piece reader for the blind and visually impaired	Aug 1, 2013	Speech Synthesis	Unverified
Large tagset labeling using Feed Forward Neural Networks. Case study on Romanian Language	Aug 1, 2013	Machine Translation, Part-Of-Speech Tagging	Unverified
Is word-to-phone mapping better than phone-phone mapping for handling English words?	Aug 1, 2013	Speech Synthesis	Unverified
POS-Tag Based Poetry Generation with WordNet	Aug 1, 2013	POS, Speech Synthesis	Unverified
Adaptive Parser-Centric Text Normalization	Aug 1, 2013	Machine Translation, Speech Recognition	Unverified
WebWOZ: A Platform for Designing and Conducting Web-based Wizard of Oz Experiments	Aug 1, 2013	Machine Translation, Speech Recognition	Unverified



Datasets: LibriTTS, North American English, LJSpeech, Mandarin Chinese, Blizzard Challenge 2013

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PeriodWave-Turbo-L	PESQ	4.45	—	Unverified
2	BigVGAN-v2	PESQ	4.36	—	Unverified
3	EVA-GAN-big	PESQ	4.35	—	Unverified
4	PeriodWave + FreeU	PESQ	4.25	—	Unverified
5	RFWave	PESQ	4.23	—	Unverified
6	BigVSAN (w/ snakebeta)	PESQ	4.12	—	Unverified
7	BigVSAN	PESQ	4.12	—	Unverified
8	EVA-GAN-base	PESQ	4.03	—	Unverified
9	BigVGAN	PESQ	4.03	—	Unverified
10	Vocos	PESQ	3.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Tacotron 2	Mean Opinion Score	4.53	—	Unverified
2	WaveNet (Linguistic)	Mean Opinion Score	4.34	—	Unverified
3	WaveNet (L+F)	Mean Opinion Score	4.21	—	Unverified
4	Tacotron	Mean Opinion Score	4	—	Unverified
5	HMM-driven concatenative	Mean Opinion Score	3.86	—	Unverified
6	LSTM-RNN parametric	Mean Opinion Score	3.67	—	Unverified
7	means	Mean Opinion Score	0	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BDDM vocoder	Mean Opinion Score	4.48	—	Unverified
2	DiffWave LARGE	Mean Opinion Score	4.44	—	Unverified
3	Neural HMM	Mean Opinion Score	3.24	—	Unverified
4	Neural HMM Ablation with 1 state per phone	Mean Opinion Score	2.68	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaveNet (L+F)	Mean Opinion Score	4.08	—	Unverified
2	LSTM-RNN parametric	Mean Opinion Score	3.79	—	Unverified
3	HMM-driven concatenative	Mean Opinion Score	3.47	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SampleRNN (2-tier)	NLL	1.39	—	Unverified
2	SampleRNN (3-tier)	NLL	1.39	—	Unverified