
Dealing with unknown words in statistical machine translation

2012-05-01 · LREC 2012

João Silva, Luísa Coheur, Ângela Costa, Isabel Trancoso


Abstract

In Statistical Machine Translation, words that were not seen during training are unknown words, that is, words that the system does not know how to translate. In this paper we address this problem by exploiting the orthographic cues that words provide. We report a study of the impact of word distance metrics on cognate detection and, in addition, of the possibility of obtaining candidate translations of unknown words through Logical Analogy. Our approach is tested on the translation of corpora from Portuguese to English (and vice versa).
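As an illustration of how a word distance metric can flag cognate candidates, the following is a minimal sketch that assumes a normalized Levenshtein similarity. The abstract does not say which metrics the paper evaluates, so the metric, the function names (`levenshtein`, `similarity`, `best_cognate`), and the 0.6 threshold are all assumptions made for this example and are not the authors' method.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to [0, 1]; 1.0 means identical spelling."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / max(len(a), len(b))


def best_cognate(unknown: str, vocabulary: Iterable[str],
                 threshold: float = 0.6) -> Optional[str]:
    """Return the most similar target word, or None if below the threshold.

    The threshold is a hypothetical value chosen for this sketch.
    """
    scored = [(similarity(unknown, word), word) for word in vocabulary]
    if not scored:
        return None
    score, word = max(scored)
    return word if score >= threshold else None


# Portuguese "telefone" against a toy English vocabulary.
candidate = best_cognate("telefone", ["telephone", "table", "window"])
```

In this toy vocabulary, "telefone" and "telephone" differ by two edits, so the pair clears the assumed threshold, while an unrelated word such as "casa" yields no candidate.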
