Speech Synthesis

Speech synthesis is the task of generating speech from another modality, such as text or lip movements.

Please note that the leaderboards here are not directly comparable between studies, because they use mean opinion score (MOS) as the metric and each study collects its own samples from Amazon Mechanical Turk.
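To illustrate why such scores are hard to compare, here is a minimal Python sketch of how a mean opinion score and an approximate 95% confidence interval might be computed from listener ratings. The function name `mean_opinion_score`, the normal-approximation interval, and the ratings themselves are all assumptions made for illustration; none of them come from the studies listed on this page.

```python
import math
import statistics


def mean_opinion_score(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mos = statistics.mean(ratings)
    # Standard error of the mean; the interval uses a normal approximation.
    sem = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, z * sem


# Hypothetical 1-5 ratings of the same audio from two separate listener pools.
study_a = [5, 4, 5, 4, 5, 4, 4, 5]
study_b = [4, 3, 4, 4, 3, 4, 5, 3]

mos_a, ci_a = mean_opinion_score(study_a)
mos_b, ci_b = mean_opinion_score(study_b)
print(f"study A: {mos_a:.2f} +/- {ci_a:.2f}")
print(f"study B: {mos_b:.2f} +/- {ci_b:.2f}")
```

In this made-up example the two pools disagree by well over half a point on identical audio, which is the kind of gap that makes a ranking across studies unreliable.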

(Image credit: WaveNet: A Generative Model for Raw Audio)

Papers

Title	Date	Tasks	Status
An In-depth Analysis of the Effect of Text Normalization in Social Media	May 1, 2015	Dependency Parsing, named-entity-recognition	Unverified
Normalization of Non-Standard Words in Croatian Texts	Mar 27, 2015	Form, General Classification	Unverified
F0 Modeling In Hmm-Based Speech Synthesis System Using Deep Belief Network	Feb 18, 2015	Clustering, Speaker Verification	Unverified
Sequence Modeling using Gated Recurrent Neural Networks	Jan 1, 2015	Machine Translation, Speech Synthesis	Unverified
Duration Modeling by Multi-Models based on Vowel Production characteristics	Dec 1, 2014	Speech Synthesis, Text-To-Speech Synthesis	Unverified
Emphasized Accent Phrase Prediction from Text for Advertisement Text-To-Speech Synthesis	Dec 1, 2014	Speech Synthesis, text-to-speech	Unverified
基於發音知識以建構頻譜HMM 之國語語音合成方法 (A Mandarin Speech Synthesis Method Using Articulation-knowledge Based Spectral HMM Structure) [In Chinese]	Oct 1, 2014	Speech Synthesis	Unverified
A Deep Learning Approach to Data-driven Parameterizations for Statistical Parametric Speech Synthesis	Sep 30, 2014	Denoising, Speech Synthesis	Unverified
An Algorithm Based on Empirical Methods, for Text-to-Tuneful-Speech Synthesis of Sanskrit Verse	Sep 15, 2014	Speech Synthesis, text-to-speech	Unverified
Speech earthquakes: scaling and universality in human voice	Aug 5, 2014	Speech Synthesis	Unverified
Situated Incremental Natural Language Understanding using a Multimodal, Linguistically-driven Update Model	Aug 1, 2014	Dialogue Management, Natural Language Understanding	Unverified
Learning to Summarise Related Sentences	Aug 1, 2014	Question Answering, Sentence Compression	Unverified
A Bengali HMM Based Speech Synthesis System	Jun 16, 2014	Speech Synthesis, text-to-speech	Unverified
Hippocratic Abbreviation Expansion	Jun 1, 2014	Information Retrieval, Machine Translation	Unverified
MVA: The Multimodal Virtual Assistant	Jun 1, 2014	Speech Recognition, Speech Synthesis	Unverified
An Expert System for Automatic Reading of A Text Written in Standard Arabic	May 8, 2014	Speech Synthesis, text-to-speech	Unverified
RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus	May 1, 2014	Speech Synthesis, text-to-speech	Unverified
Casa de la Lhéngua: a set of language resources and natural language processing tools for Mirandese	May 1, 2014	POS, POS Tagging	Unverified
Designing the Latvian Speech Recognition Corpus	May 1, 2014	speech-recognition, Speech Recognition	Unverified
Alert!... Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis.	May 1, 2014	Expressive Speech Synthesis, Sentence	Unverified
The MMASCS multi-modal annotated synchronous corpus of audio, video, facial motion and tongue motion data of normal, fast and slow speech	May 1, 2014	Speech Synthesis	Unverified
Using a machine learning model to assess the complexity of stress systems	May 1, 2014	BIG-bench Machine Learning, Speech Synthesis	Unverified
GlobalPhone: Pronunciation Dictionaries in 20 Languages	May 1, 2014	Language Identification, Language Modelling	Unverified
Using Audio Books for Training a Text-to-Speech System	May 1, 2014	Diversity, Speech Synthesis	Unverified
HESITA(te) in Portuguese	May 1, 2014	Acoustic Modelling, Automatic Speech Recognition	Unverified

Datasets: LibriTTS, North American English, LJSpeech, Mandarin Chinese, Blizzard Challenge 2013

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PeriodWave-Turbo-L	PESQ	4.45	—	Unverified
2	BigVGAN-v2	PESQ	4.36	—	Unverified
3	EVA-GAN-big	PESQ	4.35	—	Unverified
4	PeriodWave + FreeU	PESQ	4.25	—	Unverified
5	RFWave	PESQ	4.23	—	Unverified
6	BigVSAN (w/ snakebeta)	PESQ	4.12	—	Unverified
7	BigVSAN	PESQ	4.12	—	Unverified
8	EVA-GAN-base	PESQ	4.03	—	Unverified
9	BigVGAN	PESQ	4.03	—	Unverified
10	Vocos	PESQ	3.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Tacotron 2	Mean Opinion Score	4.53	—	Unverified
2	WaveNet (Linguistic)	Mean Opinion Score	4.34	—	Unverified
3	WaveNet (L+F)	Mean Opinion Score	4.21	—	Unverified
4	Tacotron	Mean Opinion Score	4	—	Unverified
5	HMM-driven concatenative	Mean Opinion Score	3.86	—	Unverified
6	LSTM-RNN parametric	Mean Opinion Score	3.67	—	Unverified
7	means	Mean Opinion Score	0	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BDDM vocoder	Mean Opinion Score	4.48	—	Unverified
2	DiffWave LARGE	Mean Opinion Score	4.44	—	Unverified
3	Neural HMM	Mean Opinion Score	3.24	—	Unverified
4	Neural HMM Ablation with 1 state per phone	Mean Opinion Score	2.68	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaveNet (L+F)	Mean Opinion Score	4.08	—	Unverified
2	LSTM-RNN parametric	Mean Opinion Score	3.79	—	Unverified
3	HMM-driven concatenative	Mean Opinion Score	3.47	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SampleRNN (2-tier)	NLL	1.39	—	Unverified
2	SampleRNN (3-tier)	NLL	1.39	—	Unverified