LSTM Autoencoders for Dialect Analysis

2016-12-01 · WS 2016

Taraka Rama, Çağrı Çöltekin

Abstract

Computational approaches to dialectometry have employed Levenshtein distance to compute an aggregate similarity between two dialects belonging to a single language group. In this paper, we apply a sequence-to-sequence autoencoder to learn a deep representation for words that can be used for meaningful comparison across dialects. In contrast to alignment-based methods, our method does not require explicit alignments. We apply our architectures to three different datasets and show that the learned representations yield results highly similar to analyses based on Levenshtein distance and capture the traditional dialectal differences identified by dialectologists.
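
For orientation, the Levenshtein-based baseline mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the toy word lists, and the choice to normalize each distance by the length of the longer word are all assumptions made here.

```python
def levenshtein(a, b):
    """Unit-cost edit distance between two sequences (e.g. phone strings)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def aggregate_distance(site_a, site_b):
    """Mean length-normalized edit distance over concepts attested at both sites.

    site_a and site_b map a concept label to the word form recorded at that
    site. Normalizing by the longer word is an assumption of this sketch.
    """
    shared = sorted(set(site_a) & set(site_b))
    if not shared:
        raise ValueError("the two sites share no concepts")
    total = 0.0
    for concept in shared:
        wa, wb = site_a[concept], site_b[concept]
        longest = max(len(wa), len(wb))
        total += levenshtein(wa, wb) / longest if longest else 0.0
    return total / len(shared)


# Hypothetical toy data, invented for illustration only.
site_a = {"house": "hus", "water": "vatn"}
site_b = {"house": "haus", "water": "vatten"}
print(aggregate_distance(site_a, site_b))
```

Computing this value for every pair of sites gives a site-by-site distance matrix, which is the kind of aggregate comparison that the autoencoder representations are evaluated against.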

Tasks

Dimensionality Reduction
