Shape of Elephant: Study of Macro Properties of Word Embeddings Spaces

2021-06-13

Alexey Tikhonov

Abstract

Pre-trained word representations have become a key component in many NLP tasks. However, the global geometry of word embeddings remains poorly understood. In this paper, we demonstrate that a typical word embedding cloud is shaped as a high-dimensional simplex with interpretable vertices, and we propose a simple yet effective method for enumerating these vertices. We show that the proposed method can detect and describe the vertices of the simplex for GloVe and fastText spaces.

Tasks

Word Embeddings
