Parameter Free Hierarchical Graph-Based Clustering for Analyzing Continuous Word Embeddings
2017-08-01 · WS 2017
Thomas Alexander Trost, Dietrich Klakow
Abstract
Word embeddings are high-dimensional vector representations of words and are thus difficult to interpret. To address this, we introduce an unsupervised, parameter-free method for creating a hierarchical graphical clustering of the full ensemble of word vectors, and we show that this structure is a geometrically meaningful representation of the original relations between the words. The resulting representation can be used to better understand, and thus improve, the embedding algorithm. It also exhibits semantic meaning, so it can be utilized in a variety of language processing tasks such as categorization or measuring similarity.
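The abstract does not spell out how the hierarchy is built, so the following is only a minimal sketch of one possible parameter-free, graph-based scheme and should not be read as the authors' algorithm. It assumes that every vector is linked to its nearest neighbor, that the connected components of that graph form the clusters of one level, and that each cluster is then replaced by its centroid before the step is repeated. The Euclidean distance, the centroid step, and all function names are assumptions made for illustration.

```python
import math


def nearest_neighbor_components(points):
    """Link every point to its nearest neighbor (Euclidean distance, an
    assumption) and return the connected components as lists of indices."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Nearest other point; ties go to the lowest index.
        j = min((k for k in range(n) if k != i),
                key=lambda k: math.dist(points[i], points[k]))
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def build_hierarchy(words, vectors):
    """Return a nested tuple tree; leaves are words, inner nodes are clusters.

    No threshold or cluster count is needed: every component of a
    nearest-neighbor graph has at least two members, so the number of
    nodes at least halves per level and the loop ends at a single root.
    """
    nodes = list(words)
    points = [list(v) for v in vectors]
    while len(nodes) > 1:
        components = nearest_neighbor_components(points)
        nodes = [tuple(nodes[i] for i in comp) for comp in components]
        points = [centroid([points[i] for i in comp]) for comp in components]
    return nodes[0]


# Toy two-dimensional "embeddings", invented purely for illustration.
words = ["cat", "dog", "car", "bus"]
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
tree = build_hierarchy(words, vectors)
print(tree)
```

On the toy data the two animal words and the two vehicle words are grouped first, and the two groups are then joined at the root. Real embeddings would more likely call for cosine distance and an approximate nearest-neighbor search, since the brute-force search here is quadratic in the vocabulary size.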