Investigating variation in written forms of Nahuatl using character-based language models

2021-06-01 · NAACL (AmericasNLP) 2021 · Code Available

Robert Pugh, Francis Tyers

Abstract

We describe experiments with character-based language modeling for written variants of Nahuatl. Using a standard LSTM model and publicly available Bible translations, we explore how character language models can be applied to the tasks of estimating mutual intelligibility, identifying genetic similarity, and distinguishing written variants. We demonstrate that these simple language models are able to capture similarities and differences that have been described in the linguistic literature.
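The abstract does not say how similarity between variants is scored. One common way to compare written varieties with character language models is cross-perplexity: train a model on one variant and measure how well it predicts text from another. The sketch below illustrates that idea only. It is not the authors' implementation: it swaps the LSTM for an add-alpha smoothed character bigram model to stay dependency-free, the function names are hypothetical, and the two strings are synthetic placeholders rather than Nahuatl data.

```python
import math
from collections import Counter

START = "^"  # hypothetical start-of-text marker, assumed absent from the texts


def train_char_bigram(text):
    """Count character bigrams and their left-hand contexts in `text`."""
    padded = START + text
    bigrams = Counter(zip(padded, padded[1:]))
    contexts = Counter(padded[:-1])
    return bigrams, contexts


def perplexity(model, text, vocab_size, alpha=1.0):
    """Per-character perplexity of `text` under an add-alpha smoothed bigram model."""
    bigrams, contexts = model
    padded = START + text
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(text))


# Synthetic stand-ins for two written variants (NOT real Nahuatl text).
variant_a = "abababababab"
variant_b = "aabbaabbaabb"

# Shared vocabulary so that both texts are scored over the same symbol set.
vocab_size = len(set(variant_a) | set(variant_b) | {START})

model_a = train_char_bigram(variant_a)
within = perplexity(model_a, variant_a, vocab_size)  # model A on its own variant
cross = perplexity(model_a, variant_b, vocab_size)   # model A on the other variant

print(f"within-variant perplexity: {within:.3f}")
print(f"cross-variant perplexity:  {cross:.3f}")
```

Under this reading, a lower cross-variant perplexity would suggest that the two written forms are closer. An actual replication would use an LSTM trained on held-out splits of the Bible translations the abstract mentions.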