Speech Synthesis

Speech synthesis is the task of generating speech from another modality, such as text or lip movements.

Please note that the leaderboards here are not directly comparable between studies, because each study uses mean opinion score as its metric and collects its own sample of ratings from Amazon Mechanical Turk.
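To make the comparability caveat concrete, here is a minimal sketch of how a mean opinion score (MOS) is commonly summarized: the average of 1 to 5 listener ratings, reported with a confidence interval. The function name `mean_opinion_score`, the normal-approximation interval, and the two rating lists are illustrative assumptions, not data or code from any listed paper.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of 1-5 listener ratings."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    mos = statistics.mean(ratings)
    # Normal-approximation interval: 1.96 * standard error of the mean.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from two separate listener pools (made-up numbers).
study_a = [5, 4, 5, 4, 4, 5, 5, 4]
study_b = [4, 3, 4, 4, 3, 4, 5, 3]

mos_a, ci_a = mean_opinion_score(study_a)
mos_b, ci_b = mean_opinion_score(study_b)
print(f"Study A: MOS {mos_a:.2f} +/- {ci_a:.2f}")
print(f"Study B: MOS {mos_b:.2f} +/- {ci_b:.2f}")
```

Because each pool of raters, set of utterances, and listening setup differs, a gap between two such scores from different studies may reflect the evaluation conditions as much as the systems themselves.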

(Image credit: WaveNet: A Generative Model for Raw Audio)

Papers

Title	Date	Tasks	Status
Investigating gated recurrent neural networks for speech synthesis	Jan 11, 2016	Speech Synthesis	Unverified
Minimally Supervised Number Normalization	Jan 1, 2016	Speech Recognition	Unverified
Text Normalization and Unit Selection for a Memory Based Non Uniform Unit Selection TTS in Malayalam	Dec 1, 2015	Speech Synthesis, Text Normalization	Unverified
Automatic Prosody Prediction for Chinese Speech Synthesis using BLSTM-RNN and Embedding Features	Nov 2, 2015	Feature Engineering, Prosody Prediction	Unverified
Hierarchical Representation of Prosody for Statistical Speech Synthesis	Oct 7, 2015	Speech Synthesis, Text-to-Speech	Unverified
A Waveform Representation Framework for High-quality Statistical Parametric Speech Synthesis	Oct 6, 2015	Speech Synthesis, Vocal Bursts Intensity Prediction	Unverified
結合ANN、全域變異數與真實軌跡挑選之基週軌跡產生方法 (A Pitch-contour Generation Method Combining ANN Prediction, Global Variance Matching, and Real-contour Selection) [In Chinese]	Oct 1, 2015	Speech Synthesis	Unverified
A Comparison of Manual and Automatic Voice Repair for Individual with Vocal Disabilities	Sep 1, 2015	Speech Synthesis	Unverified
Individuality-Preserving Spectrum Modification for Articulation Disorders Using Phone Selective Synthesis	Sep 1, 2015	Speech Synthesis, Text-To-Speech Synthesis	Unverified
Incremental Coordination: Attention-Centric Speech Production in a Physically Situated Conversational Agent	Sep 1, 2015	Speech Synthesis	Unverified

Datasets: LibriTTS, North American English, LJSpeech, Mandarin Chinese, Blizzard Challenge 2013

Benchmark Results

Dataset: LibriTTS

#	Model	Metric	Claimed	Verified	Status
1	PeriodWave-Turbo-L	PESQ	4.45	—	Unverified
2	BigVGAN-v2	PESQ	4.36	—	Unverified
3	EVA-GAN-big	PESQ	4.35	—	Unverified
4	PeriodWave + FreeU	PESQ	4.25	—	Unverified
5	RFWave	PESQ	4.23	—	Unverified
6	BigVSAN (w/ snakebeta)	PESQ	4.12	—	Unverified
7	BigVSAN	PESQ	4.12	—	Unverified
8	EVA-GAN-base	PESQ	4.03	—	Unverified
9	BigVGAN	PESQ	4.03	—	Unverified
10	Vocos	PESQ	3.7	—	Unverified

Dataset: North American English

#	Model	Metric	Claimed	Verified	Status
1	Tacotron 2	Mean Opinion Score	4.53	—	Unverified
2	WaveNet (Linguistic)	Mean Opinion Score	4.34	—	Unverified
3	WaveNet (L+F)	Mean Opinion Score	4.21	—	Unverified
4	Tacotron	Mean Opinion Score	4	—	Unverified
5	HMM-driven concatenative	Mean Opinion Score	3.86	—	Unverified
6	LSTM-RNN parametric	Mean Opinion Score	3.67	—	Unverified

Dataset: LJSpeech

#	Model	Metric	Claimed	Verified	Status
1	BDDM vocoder	Mean Opinion Score	4.48	—	Unverified
2	DiffWave LARGE	Mean Opinion Score	4.44	—	Unverified
3	Neural HMM	Mean Opinion Score	3.24	—	Unverified
4	Neural HMM Ablation with 1 state per phone	Mean Opinion Score	2.68	—	Unverified

Dataset: Mandarin Chinese

#	Model	Metric	Claimed	Verified	Status
1	WaveNet (L+F)	Mean Opinion Score	4.08	—	Unverified
2	LSTM-RNN parametric	Mean Opinion Score	3.79	—	Unverified
3	HMM-driven concatenative	Mean Opinion Score	3.47	—	Unverified

Dataset: Blizzard Challenge 2013

#	Model	Metric	Claimed	Verified	Status
1	SampleRNN (2-tier)	NLL	1.39	—	Unverified
2	SampleRNN (3-tier)	NLL	1.39	—	Unverified