Unsupervised Label Noise Modeling and Loss Correction
Eric Arazo, Diego Ortego, Paul Albert, Noel E. O'Connor, Kevin McGuinness
Code
- github.com/PaulAlbert31/LabelNoiseCorrection (PyTorch) ★ 225
- github.com/maxwell0027/pefat (PyTorch) ★ 53
Abstract
Despite being robust to small amounts of label noise, convolutional neural networks trained with stochastic gradient methods have been shown to easily fit random labels. When there is a mixture of correct and mislabelled targets, networks tend to fit the former before the latter. This suggests using a suitable two-component mixture model as an unsupervised generative model of sample loss values during training to allow online estimation of the probability that a sample is mislabelled. Specifically, we propose a beta mixture to estimate this probability and correct the loss by relying on the network prediction (the so-called bootstrapping loss). We further adapt mixup augmentation to drive our approach a step further. Experiments on CIFAR-10/100 and TinyImageNet demonstrate a robustness to label noise that substantially outperforms recent state-of-the-art. Source code is available at https://git.io/fjsvE
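The abstract describes fitting a two-component beta mixture to per-sample training losses and using the posterior probability of the high-loss (noisy) component to weight a bootstrapping correction of the loss. The sketch below is a minimal, hedged illustration of that idea, not the authors' reference implementation (see the repository above): the helper names `fit_beta_mixture`, `noisy_posterior`, and `dynamic_bootstrapping_loss` are hypothetical, the EM M-step uses a weighted method-of-moments update, and the loss shown is a soft bootstrapping variant; the paper also discusses a hard variant and combines the correction with mixup.

```python
import numpy as np
import torch
import torch.nn.functional as F
from scipy.stats import beta as beta_dist

def fit_beta_mixture(losses, n_iters=10, eps=1e-4):
    """EM for a two-component beta mixture over per-sample losses rescaled to (0, 1).

    Component 0 models low-loss (presumably clean) samples,
    component 1 models high-loss (presumably mislabelled) samples.
    """
    x = (losses - losses.min()) / (losses.max() - losses.min() + eps)
    x = np.clip(x, eps, 1 - eps)
    a = np.array([2.0, 4.0])   # initial beta shape parameters (assumed starting point)
    b = np.array([4.0, 2.0])
    pi = np.array([0.5, 0.5])  # mixing weights
    for _ in range(n_iters):
        # E-step: responsibility of each component for each sample.
        lik = np.stack([pi[k] * beta_dist.pdf(x, a[k], b[k]) for k in range(2)])
        resp = lik / (lik.sum(axis=0, keepdims=True) + eps)
        # M-step: weighted method-of-moments update of each beta component.
        for k in range(2):
            w = resp[k]
            mean = np.average(x, weights=w)
            var = np.average((x - mean) ** 2, weights=w) + eps
            common = mean * (1 - mean) / var - 1
            a[k] = max(mean * common, eps)
            b[k] = max((1 - mean) * common, eps)
            pi[k] = w.mean()
    return x, a, b, pi

def noisy_posterior(x, a, b, pi, eps=1e-4):
    """P(sample is mislabelled | loss): posterior of the high-loss component."""
    p_clean = pi[0] * beta_dist.pdf(x, a[0], b[0])
    p_noisy = pi[1] * beta_dist.pdf(x, a[1], b[1])
    return p_noisy / (p_clean + p_noisy + eps)

def dynamic_bootstrapping_loss(logits, targets, w_noisy):
    """Soft bootstrapping: per-sample convex combination of the cross-entropy with
    the given label and the cross-entropy with the network's own prediction,
    weighted by the estimated noise posterior w_noisy."""
    probs = F.softmax(logits, dim=1)
    ce_label = F.cross_entropy(logits, targets, reduction='none')
    ce_pred = -(probs.detach() * F.log_softmax(logits, dim=1)).sum(dim=1)
    return ((1 - w_noisy) * ce_label + w_noisy * ce_pred).mean()
```

In a training loop, one would periodically collect per-sample losses over the training set, refit the mixture, and feed the resulting per-sample posteriors as `w_noisy` when computing the corrected loss for each mini-batch.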
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Clothing1M | DY | Accuracy (%) | 71 | — | Unverified |