Disentangling dialects: a neural approach to Indo-Aryan historical phonology and subgrouping
2020-11-01 · CoNLL
Chundra Cathcart, Taraka Rama
Code: github.com/chundrac/ia-conll-2020 (official, linked in the paper; TensorFlow)
Abstract
This paper seeks to uncover patterns of sound change across Indo-Aryan languages using an LSTM encoder-decoder architecture. We augment our models with embeddings representing language ID, part of speech, and other features such as word embeddings. We find that a highly augmented model achieves the highest accuracy in predicting held-out forms, and we investigate other properties of interest learned by our models' representations. We outline extensions to this architecture that can better capture variation in Indo-Aryan sound change.
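
As a rough illustration of what "augmenting" a sequence model with feature embeddings can mean, the sketch below concatenates a language-ID vector and a part-of-speech vector onto every input segment vector before it would be passed to an encoder. The abstract does not say how the authors inject these features, so the concatenation scheme, the vocabularies, the dimensions, and the names `make_table` and `augment` are all assumptions made for illustration. The LSTM encoder-decoder itself is not shown.

```python
import random

random.seed(0)  # fixed seed so the toy vectors are reproducible


def make_table(keys, dim):
    """Build a toy embedding table: one random vector of length `dim` per key."""
    return {k: [random.uniform(-1.0, 1.0) for _ in range(dim)] for k in keys}


# Hypothetical vocabularies and sizes, chosen only for illustration.
SEGMENTS = ["k", "a", "r", "m"]
LANGUAGES = ["hindi", "nepali"]
POS_TAGS = ["noun", "verb"]

seg_emb = make_table(SEGMENTS, 4)    # 4-dimensional segment embeddings
lang_emb = make_table(LANGUAGES, 2)  # 2-dimensional language-ID embeddings
pos_emb = make_table(POS_TAGS, 2)    # 2-dimensional part-of-speech embeddings


def augment(segments, language, pos):
    """Return one input vector per segment: the segment embedding followed by
    the language-ID and part-of-speech embeddings (identical at every step)."""
    extra = lang_emb[language] + pos_emb[pos]  # list concatenation, length 4
    return [seg_emb[s] + extra for s in segments]


# Toy input: a five-segment form tagged with a language and a part of speech.
inputs = augment(["k", "a", "r", "m", "a"], "hindi", "noun")
print(len(inputs), len(inputs[0]))  # 5 time steps, 8 dimensions each
```

In a trained model these tables would be learned jointly with the encoder and decoder rather than drawn at random. The point of the sketch is only that the same language and part-of-speech information is visible at every time step.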