Statistically and Computationally Efficient Linear Meta-representation Learning

2021-12-01 · NeurIPS 2021

Kiran K. Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh

Abstract

In typical few-shot learning, each task is not equipped with enough data to be learned in isolation. To cope with such data scarcity, meta-representation learning methods train across many related tasks to find a shared (lower-dimensional) representation of the data where all tasks can be solved accurately. It is hypothesized that any newly arriving task can be rapidly trained on this low-dimensional representation using only a few samples. Despite the practical successes of this approach, its statistical and computational properties are less well understood. Moreover, the prescribed algorithms in these studies have little resemblance to those used in practice, or they are computationally intractable. To understand and explain the success of popular meta-representation learning approaches such as ANIL, MetaOptNet, R2D2, and OML, we study an alternating gradient-descent minimization (AltMinGD) method (and its variant, alternating minimization (AltMin)), which underlies the aforementioned methods. For a simple but canonical setting of shared linear representations, we show that AltMinGD achieves nearly-optimal estimation error, requiring only polylog(d) samples per task. This agrees with the observed efficacy of this algorithm in practical few-shot learning scenarios.
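
To make the shared linear representation setting concrete, here is a minimal NumPy sketch of an alternating scheme in the spirit of AltMinGD. It is an illustration under stated assumptions, not the paper's exact algorithm. Each task i is assumed to follow y = x^T U* w_i with a shared d x r matrix U*. The function names (`make_tasks`, `task_regressors`, `alt_min_gd`, `subspace_distance`), the random initialization, the step size, and the QR re-orthonormalization are all choices made for this example.

```python
import numpy as np


def make_tasks(t, m, d, r, seed=0):
    """Hypothetical synthetic data: t tasks, m samples each, shared d x r subspace."""
    rng = np.random.default_rng(seed)
    U_star, _ = np.linalg.qr(rng.standard_normal((d, r)))  # shared representation
    W_star = rng.standard_normal((t, r))                   # per-task regressors
    X = rng.standard_normal((t, m, d))
    Y = np.einsum("tmd,dr,tr->tm", X, U_star, W_star)      # noiseless responses
    return X, Y, U_star


def task_regressors(X, Y, U):
    """Minimization step: per-task least squares in the r-dimensional representation."""
    Z = X @ U                                   # (t, m, r) projected features
    G = Z.transpose(0, 2, 1) @ Z                # (t, r, r) normal-equation matrices
    b = np.einsum("tmr,tm->tr", Z, Y)           # (t, r)
    return np.linalg.solve(G, b[..., None])[..., 0]


def alt_min_gd(X, Y, r, steps=100, lr=0.5, seed=1):
    """Alternate exact minimization over task regressors with a gradient step on U."""
    t, m, d = X.shape
    rng = np.random.default_rng(seed)
    # Assumption: random orthonormal initialization, chosen here for simplicity.
    U, _ = np.linalg.qr(rng.standard_normal((d, r)))
    for _ in range(steps):
        W = task_regressors(X, Y, U)
        resid = np.einsum("tmd,dr,tr->tm", X, U, W) - Y
        # Gradient of the average squared loss (1 / (2 t m)) * sum(resid^2) w.r.t. U.
        grad = np.einsum("tmd,tm,tr->dr", X, resid, W) / (t * m)
        # Assumption: re-orthonormalize with QR after each gradient step.
        U, _ = np.linalg.qr(U - lr * grad)
    return U, task_regressors(X, Y, U)


def subspace_distance(U, U_star):
    """Spectral norm of the part of U lying outside span(U_star)."""
    return np.linalg.norm(U - U_star @ (U_star.T @ U), 2)


if __name__ == "__main__":
    X, Y, U_star = make_tasks(t=200, m=10, d=20, r=2)
    U_hat, W_hat = alt_min_gd(X, Y, r=2)
    print("subspace distance:", subspace_distance(U_hat, U_star))
```

In this sketch each task has only m = 10 samples in d = 20 dimensions, so no task can be solved in isolation. Once U is fixed, each task only has to fit r = 2 coefficients, which is the intuition behind the small per-task sample requirement described in the abstract.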

Tasks

Few-Shot Learning, Representation Learning