Ruminating Word Representations with Random Noise Masking
Hwiyeol Jo, Byoung-Tak Zhang
Abstract
We introduce a training method that improves word representations and task performance, which we call GraVeR (Gradual Vector Rumination). After a model has been trained, the method gradually and iteratively adds random noise and bias to its word embeddings, then re-trains the model from scratch, initialized with the noised word embeddings. During re-training, some of the noise is compensated for, while the rest can be exploited to learn better representations. As a result, the word representations are further fine-tuned and specialized to the task. On six text classification tasks, our method improves model performance by a large margin. When GraVeR is combined with other regularization techniques, it shows further improvements. Lastly, we investigate the usefulness of GraVeR for pretraining on the training data.
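The train, add noise, re-train cycle described in the abstract can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy bag-of-embeddings logistic classifier, the synthetic data, the function names `train` and `graver_sketch`, the parameters `noise_scale` and `mask_rate`, and the choice of Gaussian noise applied to a random subset of embedding entries. The "bias" term mentioned in the abstract is not modeled, since the abstract does not define it.

```python
import numpy as np

def train(emb, X, y, epochs=50, lr=0.5, seed=0):
    """Train a toy bag-of-embeddings logistic classifier.

    Classifier weights are re-initialized on every call ("from scratch");
    only the embeddings are carried in from the caller.
    """
    rng = np.random.default_rng(seed)
    emb = emb.copy()
    w = rng.normal(scale=0.1, size=emb.shape[1])
    b = 0.0
    n, length = X.shape
    for _ in range(epochs):
        feats = emb[X].mean(axis=1)                  # (n, dim) sentence vectors
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))   # predicted probabilities
        err = p - y                                  # gradient of log-loss wrt logits
        grad_w = feats.T @ err / n
        grad_b = err.mean()
        # Each token in example i receives err[i] * w / length.
        grad_emb = np.zeros_like(emb)
        np.add.at(grad_emb, X, err[:, None, None] * w[None, None, :] / (length * n))
        w -= lr * grad_w
        b -= lr * grad_b
        emb -= lr * grad_emb
    feats = emb[X].mean(axis=1)
    acc = float((((feats @ w + b) > 0) == (y == 1)).mean())
    return emb, acc

def graver_sketch(X, y, vocab, dim=8, rounds=3, noise_scale=0.1, mask_rate=0.5, seed=0):
    """Hypothetical rumination loop: train, perturb embeddings, re-train."""
    rng = np.random.default_rng(seed)
    emb = rng.normal(scale=0.1, size=(vocab, dim))
    history = []
    for r in range(rounds):
        emb, acc = train(emb, X, y, seed=seed + r)
        history.append(acc)
        if r < rounds - 1:
            # Assumption: Gaussian noise on a random subset of entries.
            mask = rng.random(emb.shape) < mask_rate
            emb = emb + mask * rng.normal(scale=noise_scale, size=emb.shape)
    return emb, history

# Synthetic data: label is 1 when most tokens come from the first half of the vocabulary.
data_rng = np.random.default_rng(0)
X = data_rng.integers(0, 20, size=(64, 5))
y = ((X < 10).mean(axis=1) > 0.5).astype(float)

final_emb, history = graver_sketch(X, y, vocab=20, rounds=3)
print(history)  # training accuracy after each round
```

The sketch only shows the control flow: embeddings persist across rounds while the rest of the model restarts. It makes no claim about the accuracy gains reported for the actual method.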