
Diversify and Disambiguate: Learning From Underspecified Data

2022-02-07

Yoonho Lee, Huaxiu Yao, Chelsea Finn


Abstract

Many datasets are underspecified: there exist multiple equally viable solutions to a given task. Underspecification is problematic for methods that learn a single hypothesis, because different functions that achieve low training loss can rely on different predictive features and thus produce widely varying predictions on out-of-distribution data. We propose DivDis, a simple two-stage framework that first learns a diverse collection of hypotheses for a task by leveraging unlabeled data from the test distribution. We then disambiguate by selecting one of the discovered hypotheses using minimal additional supervision, in the form of additional labels or inspection of function visualizations. We demonstrate the ability of DivDis to find hypotheses that use robust features in image classification and natural language processing problems with underspecification.
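To make the first stage concrete: one way to encourage a collection of hypotheses to be *diverse* on unlabeled target data is to penalize the statistical dependence between pairs of heads' predictions, so that heads achieving the same training loss are pushed toward different predictive features. Below is a minimal NumPy sketch of such a pairwise disagreement term, computed as the empirical mutual information between two heads' softmax outputs on an unlabeled batch. This is an illustrative sketch, not the authors' implementation; the function name and the exact estimator are assumptions for exposition.

```python
import numpy as np

def head_mutual_information(probs_a, probs_b):
    """Empirical mutual information between two heads' predictions.

    probs_a, probs_b: (batch, classes) arrays of softmax outputs from two
    hypothesis heads on the same unlabeled batch. A hypothetical helper
    sketching a diversification penalty: minimizing this quantity pushes
    the heads toward statistically independent (diverse) predictions.
    """
    # Empirical joint distribution p(y_a, y_b), averaged over the batch.
    joint = probs_a.T @ probs_b / len(probs_a)
    # Marginals p(y_a) and p(y_b) from the joint.
    p_a = joint.sum(axis=1, keepdims=True)
    p_b = joint.sum(axis=0, keepdims=True)
    eps = 1e-12  # numerical guard for log(0)
    return float(np.sum(joint * np.log((joint + eps) / (p_a @ p_b + eps))))
```

Two heads that always agree yield high mutual information, while a head that is uninformative about the other yields a value near zero, so adding this term (summed over head pairs) to a standard supervised loss trades training fit against diversity.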
