On Characterizing and Mitigating Imbalances in Multi-Instance Partial Label Learning

2024-07-13

Kaifu Wang, Efthymia Tsamoura, Dan Roth

Abstract

*Multi-Instance Partial Label Learning* (MI-PLL) is a weakly-supervised learning setting encompassing *partial label learning*, *latent structural learning*, and *neurosymbolic learning*. Unlike supervised learning, in MI-PLL, the inputs to the classifiers at training-time are tuples of instances x. At the same time, the supervision signal is generated by a function over the (hidden) gold labels of x. In this work, we make multiple contributions towards addressing a problem that has not been studied so far in the context of MI-PLL: that of characterizing and mitigating *learning imbalances*, i.e., major differences in the errors occurring when classifying instances of different classes (aka *class-specific risks*). In terms of theory, we derive class-specific risk bounds for MI-PLL, while making minimal assumptions. Our theory reveals a unique phenomenon: that the function generating the supervision signal can greatly impact learning imbalances. This result is in sharp contrast with previous research on supervised and weakly-supervised learning, which only studies learning imbalances under the prism of data imbalances. On the practical side, we introduce a technique for estimating the marginal of the hidden labels using only MI-PLL data. Then, we introduce algorithms that mitigate imbalances at training- and testing-time, by treating the marginal of the hidden labels as a constraint. We demonstrate the effectiveness of our techniques using strong baselines from neurosymbolic and long-tail learning, suggesting performance improvements of up to 14%.
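To make the setting concrete, here is a minimal toy sketch, not the paper's method or its risk bounds. It enumerates, for a given supervision function, which tuples of hidden labels are consistent with each observed label. The names `preimages` and `sigma`, the use of pairs of labels, and the choice of `max` and `sum` as supervision functions are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def preimages(sigma, num_classes, arity=2):
    """Group every tuple of hidden labels by the observed label it produces.

    sigma       -- hypothetical supervision function over a tuple of hidden labels
    num_classes -- hidden labels range over 0 .. num_classes - 1
    arity       -- number of instances per training tuple (assumed 2 here)
    """
    table = defaultdict(list)
    for hidden in product(range(num_classes), repeat=arity):
        table[sigma(hidden)].append(hidden)
    return dict(table)


if __name__ == "__main__":
    for name, sigma in [("max", max), ("sum", sum)]:
        print(name)
        for observed, candidates in sorted(preimages(sigma, 3).items()):
            # More candidates means the observed label says less about the hidden ones.
            print(f"  observed {observed}: {len(candidates)} candidate tuple(s) {candidates}")
```

With three classes and `max` as the supervision function, an observed 0 leaves only the tuple (0, 0), while an observed 2 leaves five candidate tuples. The supervision is therefore not equally informative about every hidden class, even when the hidden labels are uniformly distributed, which gives some intuition for why the supervision function itself, and not only data imbalance, can matter.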

Tasks

Long-tail Learning, Partial Label Learning, Weakly-supervised Learning
