Speech Synthesis

Speech synthesis is the task of generating speech from another modality, such as text or lip movements.

Please note that the leaderboards here are not directly comparable between studies, because they use mean opinion score (MOS) as the metric and each study collects its own sample of ratings from Amazon Mechanical Turk.
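To illustrate why mean opinion scores from separate studies are hard to compare, here is a minimal sketch of how such a score is typically computed: the arithmetic mean of listener ratings on a 1 to 5 scale, reported with a confidence interval. The function name, the rating lists, and the normal-approximation interval are assumptions chosen for illustration; they are not taken from any of the papers listed on this page.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mos, half_width) for a list of 1-5 listener ratings.

    half_width is a normal-approximation 95% confidence half-width
    (an illustrative choice; studies differ in how they report it).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mos = statistics.mean(ratings)
    # Standard error of the mean, scaled by the 95% normal quantile.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings of the same system from two separate listener pools.
study_a = [5, 4, 5, 4, 4, 5, 5, 4]
study_b = [4, 3, 4, 4, 3, 4, 5, 3]

mos_a, ci_a = mean_opinion_score(study_a)
mos_b, ci_b = mean_opinion_score(study_b)
print(f"study A: {mos_a:.2f} +/- {ci_a:.2f}")
print(f"study B: {mos_b:.2f} +/- {ci_b:.2f}")
```

In this made-up example the two listener pools give the same system noticeably different scores, which is the kind of gap that can arise between studies even before the models themselves differ.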

(Image credit: WaveNet: A Generative Model for Raw Audio)

Papers


Showing 601–610 of 1249 papers

Title	Date	Tasks	Status
Individuality-Preserving Spectrum Modification for Articulation Disorders Using Phone Selective Synthesis	Sep 1, 2015	Speech Synthesis, Text-To-Speech Synthesis	Unverified
Improving homograph disambiguation with supervised machine learning	May 1, 2018	BIG-bench Machine Learning, Speech Synthesis	Unverified
INPRO_iSS: A Component for Just-In-Time Incremental Speech Synthesis	Jul 1, 2012	Speech Synthesis, Spoken Dialogue Systems	Unverified
CrossSpeech: Speaker-independent Acoustic Representation for Cross-lingual Speech Synthesis	Feb 28, 2023	Speech Synthesis, text-to-speech	Unverified
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation	Dec 28, 2024	Speech Synthesis	Unverified
Intelligent Conversational Bot for Massive Online Open Courses (MOOCs)	Jan 26, 2016	General Knowledge, speech-recognition	Unverified
Improving Cross-lingual Speech Synthesis with Triplet Training Scheme	Feb 22, 2022	Speech Synthesis, text-to-speech	Unverified
Augmented Prompt Selection for Evaluation of Spontaneous Speech Synthesis	May 1, 2020	Speech Synthesis	Unverified
Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects	Aug 2, 2018	Denoising, Speech Synthesis	Unverified
Audio-visual video-to-speech synthesis with synthesized input audio	Jul 31, 2023	Speech Synthesis	Unverified



Datasets: LibriTTS, North American English, LJSpeech, Mandarin Chinese, Blizzard Challenge 2013

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PeriodWave-Turbo-L	PESQ	4.45	—	Unverified
2	BigVGAN-v2	PESQ	4.36	—	Unverified
3	EVA-GAN-big	PESQ	4.35	—	Unverified
4	PeriodWave + FreeU	PESQ	4.25	—	Unverified
5	RFWave	PESQ	4.23	—	Unverified
6	BigVSAN (w/ snakebeta)	PESQ	4.12	—	Unverified
7	BigVSAN	PESQ	4.12	—	Unverified
8	EVA-GAN-base	PESQ	4.03	—	Unverified
9	BigVGAN	PESQ	4.03	—	Unverified
10	Vocos	PESQ	3.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Tacotron 2	Mean Opinion Score	4.53	—	Unverified
2	WaveNet (Linguistic)	Mean Opinion Score	4.34	—	Unverified
3	WaveNet (L+F)	Mean Opinion Score	4.21	—	Unverified
4	Tacotron	Mean Opinion Score	4	—	Unverified
5	HMM-driven concatenative	Mean Opinion Score	3.86	—	Unverified
6	LSTM-RNN parametric	Mean Opinion Score	3.67	—	Unverified
7	means	Mean Opinion Score	0	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BDDM vocoder	Mean Opinion Score	4.48	—	Unverified
2	DiffWave LARGE	Mean Opinion Score	4.44	—	Unverified
3	Neural HMM	Mean Opinion Score	3.24	—	Unverified
4	Neural HMM Ablation with 1 state per phone	Mean Opinion Score	2.68	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaveNet (L+F)	Mean Opinion Score	4.08	—	Unverified
2	LSTM-RNN parametric	Mean Opinion Score	3.79	—	Unverified
3	HMM-driven concatenative	Mean Opinion Score	3.47	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SampleRNN (2-tier)	NLL	1.39	—	Unverified
2	SampleRNN (3-tier)	NLL	1.39	—	Unverified