Development of Mandarin-English code-switching speech synthesis system

2022-11-01, ROCLING 2022

Hsin-Jou Lien, Li-Yu Huang, Chia-Ping Chen

Abstract

This paper proposes a Mandarin-English code-switching speech synthesis system. To focus on learning the content information between the two languages, the training dataset is a multilingual artificial dataset whose speaker style is unified. Adding a language embedding to the system helps it adapt to the multilingual dataset. In addition, text preprocessing is applied in different ways depending on the language. For Mandarin, the preprocessing consists of word segmentation and text-to-pinyin conversion, which not only improve fluency but also reduce the learning complexity. Number normalization decides whether digits need to be added to the Arabic numerals in a sentence. For English, the preprocessing is acronym conversion, which decides how an acronym is pronounced.
