Global Mixup: Eliminating Ambiguity with Clustering Relationships

2021-11-16 · ACL ARR November 2021

Anonymous

Abstract

Data augmentation with Mixup has proven effective for regularizing deep neural networks. Mixup generates virtual samples and their labels by linear interpolation. However, this linear interpolation causes two problems: (1) relying only on linear relationships produces ambiguous samples that confuse the model during training; (2) the linear combination greatly limits the distribution space of the generated samples. To tackle these problems, we propose Global Mixup, a simple but effective augmentation method based on global relationships, which assigns credible and unambiguous labels to generated samples according to the global relationship between virtual and actual samples. Extensive experiments using CNN, LSTM, and BERT on five tasks show that Global Mixup significantly outperforms previous state-of-the-art baselines. Further experiments also demonstrate the advantage of Global Mixup in low-resource scenarios.
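
For readers unfamiliar with the baseline, the linear interpolation that the abstract refers to can be sketched as follows. This is a minimal illustration of standard (vanilla) Mixup only, not of Global Mixup; the function name `mixup_pair`, the `alpha` default, and the toy two-dimensional inputs are assumptions made for the example and do not come from the paper.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=1.0, rng=random):
    """Vanilla Mixup: interpolate two samples and their one-hot labels.

    `alpha` parameterizes the Beta(alpha, alpha) distribution from which
    the mixing coefficient is drawn (illustrative default, an assumption).
    """
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam


# Toy usage: two 2-D feature vectors with one-hot labels for two classes.
rng = random.Random(0)  # seeded generator so the example is repeatable
x_mix, y_mix, lam = mixup_pair(
    [1.0, 0.0], [1.0, 0.0],  # sample i and its label
    [0.0, 1.0], [0.0, 1.0],  # sample j and its label
    rng=rng,
)
```

In this sketch the virtual label depends only on the two mixed samples and the coefficient `lam`, which is the purely linear relationship the abstract identifies as a source of ambiguity. The relabeling step that Global Mixup applies using the relationship between virtual and actual samples is not shown, because the abstract does not specify how it is computed.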

Tasks

Clustering, Data Augmentation
