Global Mixup: Eliminating Ambiguity with Clustering Relationships
Anonymous
Abstract
Data augmentation with Mixup has proven to be an effective way to regularize deep neural networks. Mixup generates virtual samples and their corresponding labels by linear interpolation. However, this linear interpolation causes two problems: (1) relying only on linear relationships generates ambiguous samples, which confuse the model during training; (2) linear combination greatly limits the distribution space of the generated samples. To tackle these problems, we propose a simple but effective augmentation method based on global relationships, called Global Mixup, which assigns credible and unambiguous labels to generated samples according to the global relationship between virtual and actual samples. Extensive experiments using CNN, LSTM, and BERT on five tasks show that Global Mixup significantly outperforms previous state-of-the-art baselines. Further experiments also demonstrate the advantage of Global Mixup in low-resource scenarios.
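
The linear interpolation used by standard Mixup can be sketched as follows. This is a minimal illustration of vanilla Mixup only, not of Global Mixup, whose relabeling procedure is not described in the abstract. The function names `mixup_pair` and `sample_lambda`, the toy inputs, and the choice of drawing the mixing coefficient from a Beta(alpha, alpha) distribution (as in the original Mixup formulation) are assumptions made for illustration.

```python
import random
from typing import List, Tuple


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Draw a mixing coefficient in [0, 1] from Beta(alpha, alpha) (assumed setup)."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(
    x_i: List[float],
    y_i: List[float],
    x_j: List[float],
    y_j: List[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Linearly interpolate two samples and their one-hot labels."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix


# Toy example (hypothetical data): two 2-D samples from different classes.
rng = random.Random(0)
lam = sample_lambda(alpha=1.0, rng=rng)  # a random coefficient, as used in training

# A fixed coefficient of 0.5 makes the ambiguity easy to see.
x_mix, y_mix = mixup_pair(
    x_i=[0.0, 0.0], y_i=[1.0, 0.0],
    x_j=[2.0, 4.0], y_j=[0.0, 1.0],
    lam=0.5,
)
```

With `lam=0.5`, the virtual sample sits midway between the two inputs and receives an evenly split label, regardless of which real samples happen to lie near it. This is the kind of ambiguous label the abstract attributes to relying on linear relationships alone.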