Non-Linearity in Mapping Based Cross-Lingual Word Embeddings

2020-05-01 · LREC 2020

Jia-Wei Zhao, Andrew Gilman

Abstract

Recent work on cross-lingual word embeddings has focused mainly on linear-mapping-based approaches, in which pre-trained word embeddings are mapped into a shared vector space using a linear transformation. Such approaches rest on a key assumption: words with similar meanings share similar geometric arrangements in their monolingual word embeddings, which suggests that there is a linear relationship between languages. However, this assumption may not hold for all language pairs across all semantic concepts. We investigate whether non-linear mappings can better describe the relationship between different languages by utilising kernel Canonical Correlation Analysis (KCCA). Experimental results on five language pairs show an improvement over current state-of-the-art results in both supervised and self-learning scenarios, confirming that non-linear mapping is a better way to describe the relationship between languages.
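To make the linear-mapping baseline described in the abstract concrete, the following is a minimal sketch of one common variant, an orthogonal (Procrustes) mapping fitted on a seed dictionary. It is not the authors' KCCA method, and the abstract does not say which linear method they compare against. The function names `fit_orthogonal_map` and `translate`, the orthogonality constraint, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_orthogonal_map(X_src, Y_tgt):
    """Return the orthogonal W that minimises ||X_src @ W - Y_tgt||_F.

    Row i of X_src and row i of Y_tgt are assumed to be the embeddings
    of a translation pair from a seed dictionary (hypothetical setup).
    """
    # Closed-form Procrustes solution: SVD of the cross-covariance matrix.
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt


def translate(W, X_src, Y_tgt):
    """Map source vectors with W and return the index of the nearest
    target vector by cosine similarity for each source row."""
    mapped = X_src @ W
    mapped /= np.linalg.norm(mapped, axis=1, keepdims=True)
    tgt = Y_tgt / np.linalg.norm(Y_tgt, axis=1, keepdims=True)
    return np.argmax(mapped @ tgt.T, axis=1)


# Synthetic data in which the "target language" is an exact rotation of
# the "source language", i.e. the linear assumption holds perfectly.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
R, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
Y = X @ R

W = fit_orthogonal_map(X, Y)
nearest = translate(W, X, Y)
```

On this toy data the fitted map recovers the rotation, so every source vector retrieves its own pair. The abstract's argument is that real language pairs need not satisfy this idealised relationship, which is the motivation for replacing the linear map with a kernel-based one.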

Tasks

Cross-Lingual Word Embeddings, Self-Learning, Word Embeddings
