AutoCleansing: Unbiased Estimation of Deep Learning with Mislabeled Data
Koichi Kuriyama
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
Mislabeled samples cause prediction errors. This study proposes a solution to the problem of incorrect labels, called AutoCleansing, to automatically capture the effect of incorrect labels and mitigate it without removing the mislabeled samples. AutoCleansing consists of a base network model and sample-category specific constants. Both parameters of the base model and sample-category constants are estimated simultaneously using the training data. Thereafter, predictions for test data are made using a base model without the constants capturing the mislabeled effects. A theoretical model for AutoCleansing is developed and theoretical analysis shows that the proposed method can estimate true parameters with mislabeled data if the model is correctly constructed. Experimental results show that AutoCleansing has better performance in test accuracy than previous studies for CIFAR-10, CIFAR-100, SVHN, and ImageNet datasets.