Shape of Elephant: Study of Macro Properties of Word Embeddings Spaces
2021-06-13
Alexey Tikhonov
Abstract
Pre-trained word representations have become a key component in many NLP tasks. However, the global geometry of word embeddings remains poorly understood. In this paper, we demonstrate that a typical word embedding cloud is shaped like a high-dimensional simplex with interpretable vertices, and we propose a simple yet effective method for enumerating these vertices. We show that the proposed method can detect and describe the vertices of the simplex for GloVe and fastText spaces.