Shape of Elephant: Study of Macro Properties of Word Embeddings Spaces

2021-06-13

Alexey Tikhonov

Abstract

Pre-trained word representations have become a key component in many NLP tasks. However, the global geometry of word embedding spaces remains poorly understood. In this paper, we demonstrate that a typical word embedding cloud is shaped like a high-dimensional simplex with interpretable vertices, and we propose a simple yet effective method for enumerating these vertices. We show that the proposed method can detect and describe the vertices of the simplex for GloVe and fastText spaces.
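
The abstract does not describe how the vertices are enumerated, so the following is only a generic illustration of what "vertices of a simplex-shaped cloud" means, not the paper's method. It assumes a toy cloud built from three known vertices and looks for points that are farthest along some random direction, since the maximizer of a linear functional over a point cloud always lies on its convex hull. The function name `extreme_point_indices`, the toy data, and all parameters are hypothetical.

```python
import numpy as np

def extreme_point_indices(points, n_directions=200, seed=0):
    """Indices of points that are farthest along at least one random direction.

    Generic convex-hull heuristic, used here for illustration only.
    """
    rng = np.random.default_rng(seed)
    centered = points - points.mean(axis=0)
    directions = rng.standard_normal((n_directions, points.shape[1]))
    # Column j holds the projections of all points onto direction j;
    # the argmax of each column is an extreme point of the cloud.
    winners = np.argmax(centered @ directions.T, axis=0)
    return np.unique(winners)

# Toy cloud (assumption): 3 known vertices in 5-D plus 500 convex
# combinations of them, so the cloud fills a 2-simplex.
rng = np.random.default_rng(1)
vertices = rng.standard_normal((3, 5))
weights = rng.dirichlet(np.ones(3), size=500)
cloud = np.vstack([vertices, weights @ vertices])

found = extreme_point_indices(cloud)
print("candidate vertex indices:", found.tolist())
```

In this toy setup the true vertices sit at rows 0 to 2 of `cloud`, so every index returned should fall in that range. Real embedding clouds are noisy and only approximately simplex-shaped, so a heuristic like this would at best give candidate vertices to inspect.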
