The sample complexity of multi-distribution learning
2023-12-07
Binghui Peng
Abstract
Multi-distribution learning generalizes classic PAC learning to handle data coming from multiple distributions. Given a set of k data distributions and a hypothesis class of VC dimension d, the goal is to learn a hypothesis that minimizes the maximum population loss over the k distributions, up to additive error ε. In this paper, we settle the sample complexity of multi-distribution learning by giving an algorithm of sample complexity Õ((d+k)ε^{-2}) · (k/ε)^{o(1)}. This matches the lower bound up to a sub-polynomial factor and resolves the COLT 2023 open problem of Awasthi, Haghtalab and Zhao [AHZ23].
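To make the objective concrete, here is a minimal illustrative sketch (not the paper's algorithm): it contrasts the multi-distribution goal, minimizing the maximum loss across k sampled distributions, with classic single-distribution empirical risk minimization. The 1-D threshold hypothesis class, the toy distributions, and all function names are assumptions for illustration only.

```python
import random

def zero_one_loss(threshold, sample):
    """Fraction of (x, y) pairs misclassified by h(x) = 1[x >= threshold]."""
    return sum((x >= threshold) != y for x, y in sample) / len(sample)

def multi_distribution_erm(hypotheses, samples):
    """Return the hypothesis minimizing the WORST-CASE loss over all k samples
    (the multi-distribution objective), rather than loss on any single one."""
    return min(hypotheses,
               key=lambda h: max(zero_one_loss(h, s) for s in samples))

# Two toy "distributions": uniform x in [0, 1], labeled by different
# true thresholds (0.3 and 0.7), so no single threshold fits both perfectly.
random.seed(0)
def draw(true_t, n=200):
    xs = [random.random() for _ in range(n)]
    return [(x, x >= true_t) for x in xs]

samples = [draw(0.3), draw(0.7)]
hypotheses = [i / 100 for i in range(101)]  # finite threshold class

h = multi_distribution_erm(hypotheses, samples)
worst = max(zero_one_loss(h, s) for s in samples)
print(h, round(worst, 3))
```

The minimax hypothesis lands between the two true thresholds, trading some loss on each distribution for a smaller worst case; a hypothesis tuned to either distribution alone would incur a larger maximum loss.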