Minimax Optimal Algorithms with Fixed-k-Nearest Neighbors
J. Jon Ryu, Young-Han Kim
Code: github.com/jongharyu/split-knn-rules (official)
Abstract
This paper shows how to perform minimax optimal classification, regression, and density estimation based on fixed-k nearest neighbor (NN) searches. We consider a distributed learning scenario in which a massive dataset is split into smaller groups, and the k-NNs of a query point are found within each subset of the data. We propose optimal rules for aggregating the fixed-k-NN information that achieve minimax optimal rates for classification, regression, and density estimation. We show that, under some regularity conditions, the distributed algorithm with a fixed k attains a minimax optimal error rate, up to a multiplicative logarithmic factor, provided the number of groups is sufficiently large. Roughly speaking, distributed k-NN rules with M groups perform comparably to the standard (kM)-NN rules, even for fixed k.
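The distributed scheme described above can be sketched as follows. This is a minimal illustration, not the paper's exact aggregation rule: it splits the data into M groups, runs a fixed-k NN search within each group, and averages the per-group regression estimates (the paper proposes optimally weighted aggregation rules). All function and variable names here are illustrative.

```python
import numpy as np

def split_knn_regress(X, y, query, k=3, M=4, seed=None):
    """Estimate the regression function at `query` by averaging
    per-group fixed-k-NN estimates over M random splits of the data."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    estimates = []
    for group in np.array_split(idx, M):
        # Squared Euclidean distances from the query to this group's points.
        d = np.sum((X[group] - query) ** 2, axis=1)
        # Indices (into the full dataset) of the k nearest points in this group.
        nn = group[np.argsort(d)[:k]]
        # Per-group k-NN regression estimate: mean label of the k neighbors.
        estimates.append(y[nn].mean())
    # Simple unweighted aggregation across the M groups.
    return float(np.mean(estimates))

# Toy usage: y = x1 + x2 on the unit square, so the estimate
# at (0.5, 0.5) should be close to 1.0.
rng = np.random.default_rng(0)
X = rng.random((400, 2))
y = X.sum(axis=1)
est = split_knn_regress(X, y, np.array([0.5, 0.5]), k=3, M=8, seed=1)
```

The intuition matching the abstract: each group contributes k neighbors, so the aggregate behaves like a (kM)-NN rule on the full dataset while each worker only ever performs a cheap fixed-k search on its own shard.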