Wiktionary Normalization of Translations and Morphological Information
2020-12-01 · COLING 2020
Winston Wu, David Yarowsky
Abstract
We extend the Yawipa Wiktionary Parser (Wu and Yarowsky, 2020) to extract and normalize translations from etymology glosses, as well as morphological form-of relations, resulting in 300K unique translations and over 4 million instances of 168 annotated morphological relations. We propose a method to identify typos in translation annotations. Using the extracted morphological data, we develop multilingual neural models for predicting three types of word formation (clipping, contraction, and eye dialect) and improve upon a standard attention baseline by using copy attention.
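To make "form-of relation" concrete, here is a minimal sketch of pulling one such relation out of a line of Wiktionary markup. It is not Yawipa's implementation: the `FormOf` type, the `parse_form_of` helper, and the regular expression are hypothetical names chosen for illustration, and the sketch assumes the relation is written as a template of the shape `{{<relation> of|<lang>|<target>}}`, as in `{{clipping of|en|laboratory}}`.

```python
import re
from typing import NamedTuple, Optional


class FormOf(NamedTuple):
    """One extracted relation (hypothetical container, not part of Yawipa)."""
    relation: str  # e.g. "clipping of"
    lang: str      # language code given in the template, e.g. "en"
    target: str    # the word the entry is a form of


# Assumed template shape: {{<relation> of|<lang>|<target>|...optional params}}
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+ of)\|([^{}|]+)\|([^{}|]+)[^{}]*\}\}")


def parse_form_of(wikitext: str) -> Optional[FormOf]:
    """Return the first form-of template found in the text, or None."""
    match = TEMPLATE_RE.search(wikitext)
    if match is None:
        return None
    relation, lang, target = (group.strip() for group in match.groups())
    return FormOf(relation, lang, target)


if __name__ == "__main__":
    for line in (
        "# {{clipping of|en|laboratory}}",
        "# {{contraction of|en|do not}}",
        "# {{eye dialect of|en|was}}",
    ):
        print(parse_form_of(line))
```

Normalizing the relation name (here, simply the template name) is what allows instances to be counted per relation type; a real parser would also have to handle nested templates and named parameters, which this sketch ignores.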