Discriminating Between Similar Nordic Languages
2020-12-11 · EACL (VarDial) 2021 · Code Available
René Haas, Leon Derczynski
Code
- github.com/StrombergNLP/NordicDSL (Official, ★ 1)
- github.com/renhaa/NordicDSL (In paper, ★ 6)
Abstract
Automatic language identification is a challenging problem. Discriminating between closely related languages is especially difficult. This paper presents a machine learning approach for automatic language identification for the Nordic languages, which often suffer miscategorisation by existing state-of-the-art tools. Concretely, we focus on discrimination between six Nordic languages: Danish, Swedish, Norwegian (Nynorsk), Norwegian (Bokmål), Faroese and Icelandic.
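To make the task concrete, here is a minimal sketch of a language identifier built on character trigram counts with add-one smoothing. It is not the paper's method: the benchmark below names a FastText model, whereas this is a hand-rolled naive Bayes scorer using only the Python standard library. The function names (`char_ngrams`, `train`, `classify`) are hypothetical, and the three training sentences are toy examples written for this illustration, covering only three of the six languages and not drawn from the paper's data.

```python
import math
from collections import Counter

# Toy sentences written for this illustration; NOT taken from the paper's dataset.
TRAIN = {
    "da": ["Jeg kan godt lide at læse bøger om aftenen."],
    "sv": ["Jag tycker om att läsa böcker på kvällen."],
    "is": ["Mér finnst gaman að lesa bækur á kvöldin."],
}


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of the lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(corpus, n=3):
    """Count character n-grams per language and collect the shared vocabulary."""
    counts = {lang: Counter() for lang in corpus}
    for lang, sentences in corpus.items():
        for sentence in sentences:
            counts[lang].update(char_ngrams(sentence, n))
    vocab = set().union(*counts.values())
    return counts, vocab


def classify(text, counts, vocab, n=3):
    """Pick the language with the highest add-one-smoothed log-likelihood.

    A uniform prior over languages is assumed, so it is left out of the score.
    """
    scores = {}
    for lang, c in counts.items():
        # +1 reserves probability mass for n-grams never seen in training.
        total = sum(c.values()) + len(vocab) + 1
        scores[lang] = sum(
            math.log((c[g] + 1) / total) for g in char_ngrams(text, n)
        )
    return max(scores, key=scores.get)


counts, vocab = train(TRAIN)
print(classify("Jag tycker om att läsa böcker på kvällen.", counts, vocab))
```

With one sentence per language, the model can only separate inputs that share trigrams with its training text. Telling apart closely related pairs such as Danish and the two Norwegian written standards would require far more data, which is the difficulty the abstract describes.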
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Nordic Language Identification | FastText | Accuracy | 0.97 | — | Unverified |