German Dialect Identification and Mapping for Preservation and Recovery

2022-06-01 · EURALI (LREC) 2022

Aynalem Tesfaye Misganaw, Sabine Roller

Abstract

Many linguistic projects that focus on dialects collect audio data, analyze it, and interpret it linguistically. The outcomes of such projects are valuable language resources because dialects are among the less-resourced languages, as most of them are oral traditions. Our project, Dialektatlas Mittleres Westdeutschland (DMW), studies German language varieties by collecting audio data of words and phrases that linguistic experts select for their significance in distinguishing dialects from one another. We used a total of 7,814 audio snippets of words and phrases from eight different dialects of middle west Germany. We employed a multilabel classification approach to address the problem of dialect mapping, using the Support Vector Machine (SVM) algorithm. The experiments yielded a promising accuracy of 87%.
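
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which acoustic features, SVM kernel, or training procedure were used. The sketch assumes synthetic feature vectors in place of features extracted from audio snippets, assigns exactly one of eight dialect labels to each snippet, and trains one linear SVM per dialect (one-vs-rest) with Pegasos-style stochastic subgradient descent. All names and parameter values (`make_synthetic_data`, `lam`, `steps`) are hypothetical.

```python
import random

NUM_DIALECTS = 8      # eight dialects, as stated in the abstract
DIM = NUM_DIALECTS    # hypothetical feature dimension (assumption)


def make_synthetic_data(per_class, rng):
    """Toy stand-in for acoustic features: one noisy cluster per dialect."""
    data = []
    for label in range(NUM_DIALECTS):
        for _ in range(per_class):
            x = [rng.gauss(3.0 if d == label else 0.0, 0.5) for d in range(DIM)]
            data.append((x + [1.0], label))  # trailing 1.0 acts as a bias term
    rng.shuffle(data)
    return data


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_binary_svm(samples, lam, steps, rng):
    """Pegasos: stochastic subgradient descent on the regularized hinge loss."""
    w = [0.0] * len(samples[0][0])
    for t in range(1, steps + 1):
        x, y = rng.choice(samples)
        eta = 1.0 / (lam * t)              # decaying step size
        violated = y * dot(w, x) < 1.0     # margin check with the current weights
        w = [(1.0 - eta * lam) * wi for wi in w]
        if violated:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def train_one_vs_rest(train, lam=0.05, steps=5000, seed=1):
    """Train one binary SVM per dialect: that dialect (+1) against the rest (-1)."""
    rng = random.Random(seed)
    models = []
    for label in range(NUM_DIALECTS):
        binary = [(x, 1.0 if y == label else -1.0) for x, y in train]
        models.append(train_binary_svm(binary, lam, steps, rng))
    return models


def predict(models, x):
    """Return the dialect whose SVM gives the highest score."""
    scores = [dot(w, x) for w in models]
    return scores.index(max(scores))


def accuracy(models, data):
    return sum(predict(models, x) == y for x, y in data) / len(data)


rng = random.Random(0)
data = make_synthetic_data(per_class=50, rng=rng)
split = int(0.8 * len(data))               # 80/20 train/test split
train, test = data[:split], data[split:]
models = train_one_vs_rest(train)
test_accuracy = accuracy(models, test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.2f}")
```

The accuracy printed here reflects only the toy clusters and says nothing about the 87% reported for the DMW recordings. With real data, the feature vectors would come from the audio snippets, and a library SVM implementation would normally replace the hand-written training loop.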