
A Corpus of Neutral Voice Speech in Brazilian Portuguese

2021-05-21 · International Conference on Computational Processing of the Portuguese Language 2021

Pedro H. L. Leite, Edmundo Hoyle, Álvaro Antelo, Luiz F. Kruszielski, Luiz W. P. Biscainho


Abstract

This work presents a new database containing high-sampling-rate recordings of a single male speaker reading sentences in Brazilian Portuguese in a neutral voice, along with the corresponding text corpus. Intended for synthesis and other speech-oriented applications, the dataset contains text scripts extracted from a popular Brazilian news TV program, read aloud by a trained individual in a controlled environment, resulting in roughly 20 h of audio data. The text was normalized during the recording process, and special textual occurrences (e.g., acronyms, numbers, and foreign names) were replaced by a phonetic rendering in readable Portuguese text. There are no noticeable accidental sounds, and background noise has been kept to a minimum in all audio samples. To illustrate the potential benefits of having this data available, text-to-speech experiments were conducted using state-of-the-art models for speech synthesis (Tacotron 2 and WaveGlow). After first training on our data, we obtained intelligible and natural-sounding voices from as few as 8 min of audio samples from an unseen target speaker; moreover, increasing the target recording time to 75 min noticeably improved pronunciation accuracy.
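
The normalization step mentioned in the abstract (replacing acronyms and numbers with readable Portuguese text) can be illustrated with a minimal sketch. The abstract does not give the authors' actual rules, so everything below is an assumption made for illustration: the `ACRONYMS` table, the `number_to_words` and `normalize` helpers, and the restriction to whole numbers from 0 to 99 are all hypothetical.

```python
import re

# Hypothetical lookup table, for illustration only; the corpus's real
# normalization rules are not described in the abstract.
ACRONYMS = {"TV": "tevê", "IBGE": "i bê gê é"}

UNITS = ["zero", "um", "dois", "três", "quatro", "cinco",
         "seis", "sete", "oito", "nove"]
TEENS = ["dez", "onze", "doze", "treze", "catorze", "quinze",
         "dezesseis", "dezessete", "dezoito", "dezenove"]
TENS = {2: "vinte", 3: "trinta", 4: "quarenta", 5: "cinquenta",
        6: "sessenta", 7: "setenta", 8: "oitenta", 9: "noventa"}


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 99 in Brazilian Portuguese."""
    if n < 10:
        return UNITS[n]
    if n < 20:
        return TEENS[n - 10]
    tens, unit = divmod(n, 10)
    return TENS[tens] if unit == 0 else f"{TENS[tens]} e {UNITS[unit]}"


def normalize(text: str) -> str:
    """Replace standalone 1-2 digit numbers and known acronyms with words."""
    # Numbers first: only standalone runs of one or two digits are handled;
    # longer numbers are left untouched in this sketch.
    text = re.sub(r"\b\d{1,2}\b",
                  lambda m: number_to_words(int(m.group())), text)
    # Acronyms: all-caps tokens found in the table are replaced,
    # unknown ones are kept as-is.
    text = re.sub(r"\b[A-Z]{2,}\b",
                  lambda m: ACRONYMS.get(m.group(), m.group()), text)
    return text


print(normalize("A TV exibiu 25 reportagens"))
```

A production normalizer would also need to handle larger numbers, decimals, dates, grammatical gender agreement (e.g. "duas" versus "dois"), and foreign names, none of which this sketch attempts.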
