Reconstructing Word Embeddings via Scattered k-Sub-Embedding
Soonyong Hwang, Byung-Ro Moon
Abstract
The performance of modern neural language models relies heavily on the diversity of their vocabularies. Unfortunately, as language models cover larger vocabularies, the embedding parameters — for instance in multilingual models — come to occupy more than half of the total learnable parameters. To address this problem, we devise a novel embedding structure that lightens the network without considerable performance degradation. To reconstruct N embedding vectors, we initialize k bundles of M (≪ N) k-sub-embeddings and combine them via a Cartesian product. Furthermore, we assign k-sub-embeddings using the contextual relationships between tokens obtained from pretrained language models. We adapt our k-sub-embedding structure to masked language models to evaluate the proposed structure on downstream tasks. Our experimental results show that sub-embeddings compressed by over 99.9% perform comparably to the original embedding structure on the GLUE and XNLI benchmarks.
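The core idea above — addressing up to M^k tokens with only k small tables of M sub-embeddings each — can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's implementation: here each token id is decomposed into k base-M digits, one index per sub-table, and the full vector is assembled by concatenation (the paper's composition and its contextual index assignment may differ).

```python
import numpy as np

def make_sub_tables(k, M, sub_dim, seed=0):
    # k small tables of M sub-embeddings each (random init for the sketch)
    rng = np.random.default_rng(seed)
    return [rng.standard_normal((M, sub_dim)) for _ in range(k)]

def token_to_sub_indices(token_id, k, M):
    # Base-M digits of the token id: one index into each sub-table.
    # Distinct token ids in [0, M**k) map to distinct index tuples,
    # which is what makes the Cartesian product cover the vocabulary.
    digits = []
    for _ in range(k):
        digits.append(token_id % M)
        token_id //= M
    return digits

def reconstruct_embedding(token_id, tables, M):
    # Assemble the full embedding by concatenating one row per table
    # (concatenation is an assumed composition, chosen for clarity).
    idx = token_to_sub_indices(token_id, len(tables), M)
    return np.concatenate([tables[i][idx[i]] for i in range(len(tables))])

# With k = 2 tables of M = 256 rows we can address 256**2 = 65,536 tokens
# while storing only 2 * 256 = 512 sub-embedding rows.
tables = make_sub_tables(k=2, M=256, sub_dim=64)
vec = reconstruct_embedding(token_id=1234, tables=tables, M=256)
```

The parameter saving comes from replacing an N × d table with k tables of M × (d/k), i.e. M · d parameters instead of N · d — the >99.9% compression reported in the abstract corresponds to choosing M far smaller than N.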