Automatically Identifying Language Family from Acoustic Examples in Low Resource Scenarios
2020-12-01
Peter Wu, Yifan Zhong, Alan W Black
Code
- github.com/peter-yh-wu/multilingual (official, in paper)
Abstract
Existing work in multilingual speech NLP focuses on a relatively small subset of languages, and thus current linguistic understanding of languages predominantly stems from classical approaches. In this work, we propose a method for analyzing language similarity using deep learning. Specifically, we train a model on the Wilderness dataset and investigate how its latent space compares with classical language family findings. Our approach provides a new direction for cross-lingual data augmentation in any speech-based NLP task.
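
As a rough illustration of what comparing a latent space against language groupings can look like, the sketch below averages utterance-level embeddings per language and reports each language's nearest neighbor by cosine similarity. This is not the authors' implementation: the helper names, the synthetic 3-dimensional vectors, and the labels `lang_a`, `lang_b`, `lang_c` are all hypothetical stand-ins for embeddings that a trained model would produce.

```python
import numpy as np


def language_centroids(embeddings, labels):
    """Average the utterance-level embeddings of each language (hypothetical helper)."""
    labels = np.asarray(labels)
    langs = sorted(set(labels.tolist()))
    centroids = np.stack([embeddings[labels == lang].mean(axis=0) for lang in langs])
    return langs, centroids


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


def nearest_language(langs, sim):
    """Map each language to its most similar other language."""
    masked = sim.copy()
    np.fill_diagonal(masked, -np.inf)  # a language is not its own neighbor
    return {lang: langs[int(j)] for lang, j in zip(langs, masked.argmax(axis=1))}


# Synthetic stand-in data (assumption): 20 noisy "utterance embeddings" per language.
rng = np.random.default_rng(0)
centers = {
    "lang_a": [1.0, 0.0, 0.0],
    "lang_b": [0.9, 0.1, 0.0],  # deliberately placed close to lang_a
    "lang_c": [0.0, 0.0, 1.0],
}
embeddings = np.concatenate(
    [np.asarray(c) + 0.01 * rng.standard_normal((20, 3)) for c in centers.values()]
)
labels = [lang for lang in centers for _ in range(20)]

langs, centroids = language_centroids(embeddings, labels)
sim = cosine_similarity_matrix(centroids)
neighbors = nearest_language(langs, sim)
print(neighbors)
```

With real model embeddings, the nearest-neighbor pairs could then be checked against the families assigned by classical linguistic sources. Cosine similarity over per-language centroids is only one simple choice of comparison.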