Better Translation for Vietnamese

2021-04-20 · Code available

Chinh Ngo, Trieu Trinh

Abstract

We collect data from open sources on the Internet and classify it into categories, each labeled with a specific language style. In total, the dataset contains 3.3 million English-Vietnamese text pairs, ranging from single sentences to paragraphs. A model trained on our dataset outperforms Google Translate on a selected set of diverse text sources. On IWSLT'15, we achieve a BLEU score of 37.84.
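
For readers unfamiliar with the metric, the following is a minimal sketch of how a corpus-level BLEU score is computed: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and scaled by a brevity penalty. It is an illustration only, not the authors' evaluation script. The function names `ngrams` and `corpus_bleu` are hypothetical, and the sketch assumes whitespace tokenization, a single reference per hypothesis, and no smoothing. The abstract does not state the tokenization or tooling behind the reported 37.84, so scores from this sketch should not be compared with it directly.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale.

    Assumptions (not taken from the paper): whitespace tokenization,
    exactly one reference per hypothesis, and no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches, summed over the corpus
    totals = [0] * max_n   # hypothesis n-gram counts, summed over the corpus
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection keeps the minimum count per n-gram,
            # which is exactly the "clipping" step of BLEU.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    # Without smoothing, any zero precision makes the geometric mean zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyps = ["the cat sat on the mat today"]
    refs = ["the cat sat on the mat"]
    print(round(corpus_bleu(hyps, refs), 2))
```

In practice, published scores are usually produced with a standardized tool so that tokenization is consistent across papers; this sketch only shows the arithmetic behind the number.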
