Investigating variation in written forms of Nahuatl using character-based language models
2021-06-01 · NAACL (AmericasNLP) 2021
Robert Pugh, Francis Tyers
Code
- github.com/lguyogiro/nahuatl-variant-charlms-americasnlp (official, linked in paper)
Abstract
We describe experiments with character-based language modeling for written variants of Nahuatl. Using a standard LSTM model and publicly available Bible translations, we explore how character language models can be applied to the tasks of estimating mutual intelligibility, identifying genetic similarity, and distinguishing written variants. We demonstrate that these simple language models are able to capture similarities and differences that have been described in the linguistic literature.
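To make the idea of comparing written variants with a character language model concrete, here is a minimal sketch. It is not the authors' implementation: it swaps the LSTM for an add-k smoothed character bigram model so that it runs without any dependencies, and it assumes cross-variant perplexity as the similarity signal. The class name `CharBigramLM` and the three toy strings are hypothetical placeholders, not Nahuatl data.

```python
import math
from collections import Counter


class CharBigramLM:
    """Add-k smoothed character bigram model.

    Assumption: a simple stand-in for the LSTM character model
    mentioned in the abstract, chosen only to keep the sketch small.
    """

    BOS = "\x02"  # start-of-text marker used as the first context

    def __init__(self, text, vocab, k=1.0):
        self.k = k
        self.vocab = set(vocab)
        padded = self.BOS + text
        # Counts of (previous char, next char) pairs and of contexts.
        self.bigrams = Counter(zip(padded, padded[1:]))
        self.contexts = Counter(padded[:-1])

    def log_prob(self, prev, ch):
        num = self.bigrams[(prev, ch)] + self.k
        den = self.contexts[prev] + self.k * len(self.vocab)
        return math.log(num / den)

    def perplexity(self, text):
        """Per-character perplexity of `text` under this model."""
        padded = self.BOS + text
        total = sum(self.log_prob(p, c) for p, c in zip(padded, padded[1:]))
        return math.exp(-total / len(text))


# Hypothetical toy "variants" (placeholders, not real text).
variant_a = "abababababab"
variant_b = "abababbabab"   # close to variant_a
variant_c = "ccddccddccdd"  # unlike variant_a

# Shared character inventory so smoothing is comparable across texts.
vocab = set(variant_a + variant_b + variant_c)

lm = CharBigramLM(variant_a, vocab)
ppl_a = lm.perplexity(variant_a)
ppl_b = lm.perplexity(variant_b)
ppl_c = lm.perplexity(variant_c)
print(f"a->a: {ppl_a:.2f}  a->b: {ppl_b:.2f}  a->c: {ppl_c:.2f}")
```

On these toy strings the model trained on `variant_a` assigns lower perplexity to `variant_b` than to `variant_c`. Under the stated assumption, lower cross-variant perplexity is read as greater similarity between written forms.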