Effective Dimensionality Reduction for Word Embeddings

2019-08-01 · WS 2019

Vikas Raunak, Vivek Gupta, Florian Metze


Code

github.com/vyraun/Half-Size
Official implementation (linked in the paper)

Abstract

Pre-trained word embeddings are used in several downstream applications as well as for constructing representations for sentences, paragraphs and documents. Recently, there has been an emphasis on improving pre-trained word vectors through post-processing algorithms. One area of improvement is reducing the dimensionality of word embeddings. Reducing the size of word embeddings can improve their utility on memory-constrained devices, benefiting several real-world applications. In this work, we present a novel technique that efficiently combines PCA-based dimensionality reduction with a recently proposed post-processing algorithm (Mu and Viswanath, 2018) to construct effective word embeddings of lower dimensions. Empirical evaluations on several benchmarks show that our algorithm efficiently reduces the embedding size while achieving similar or (more often) better performance than the original embeddings. We have released the source code along with this paper.
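
As a rough illustration of how PCA-based reduction might be combined with the post-processing of Mu and Viswanath (2018), here is a minimal NumPy sketch. It is not taken from the released code. The function names, the ordering of the steps (post-process, reduce, post-process again), and the default d=7 are assumptions made for illustration, and the post-processing step is written from its general description: center the vectors, then remove their projections onto the top d principal directions.

```python
import numpy as np


def remove_top_components(X, d):
    """Center X, then remove its projections onto the top-d principal directions.

    Hypothetical helper sketching the style of post-processing described by
    Mu and Viswanath (2018); not the authors' implementation.
    """
    X = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    top = Vt[:d]
    return X - X @ top.T @ top


def pca_reduce(X, n):
    """Project centered X onto its first n principal directions."""
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n].T


def reduce_embeddings(X, n, d=7):
    """Assumed ordering: post-process, reduce with PCA, post-process again."""
    X = remove_top_components(X, d)
    X = pca_reduce(X, n)
    return remove_top_components(X, d)


# Random vectors stand in for a real (vocabulary x dimension) embedding matrix.
rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 300))
half = reduce_embeddings(emb, n=150)
```

With real embeddings, `emb` would be loaded from a pre-trained vector file, and `n` and `d` would be chosen by evaluating the reduced vectors on the benchmarks of interest.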

Tasks

Dimensionality Reduction, Word Embeddings
