Fine-grained Morphosyntactic Analysis and Generation Tools for More Than One Thousand Languages

2020-05-01 · LREC 2020

Garrett Nicolai, Dylan Lewis, Arya D. McCarthy, Aaron Mueller, Winston Wu, David Yarowsky

Abstract

Exploiting the broad translation of the Bible into the world's languages, we train and distribute morphosyntactic tools for approximately one thousand languages, vastly outstripping previous distributions of tools devoted to the processing of inflectional morphology. Evaluation of the tools on a subset of available inflectional dictionaries demonstrates strong initial models, supplemented and improved through ensembling and dictionary-based reranking. Likewise, a novel type-to-token based evaluation metric allows us to confirm that models generalize well across rare and common forms alike.
