Optimal Estimator for Linear Regression with Shuffled Labels

2023-10-02Unverified0· sign in to hype

Hang Zhang, Ping Li

Unverified — Be the first to reproduce this paper.

Abstract

This paper considers the task of linear regression with shuffled labels, i.e., Y = X B + W, where Y R^n m, Pi R^n n, X R^n p, B R^p m, and W R^n m, respectively, represent the sensing results, (unknown or missing) corresponding information, sensing matrix, signal of interest, and additive sensing noise. Given the observation Y and sensing matrix X, we propose a one-step estimator to reconstruct ( , B). From the computational perspective, our estimator's complexity is O(n^3 + np^2m), which is no greater than the maximum complexity of a linear assignment algorithm (e.g., O(n^3)) and a least square algorithm (e.g., O(np^2 m)). From the statistical perspective, we divide the minimum snr requirement into four regimes, e.g., unknown, hard, medium, and easy regimes; and present sufficient conditions for the correct permutation recovery under each regime: (i) snr (1) in the easy regime; (ii) snr ( n) in the medium regime; and (iii) snr (( n)^c_0 n^c_1/srank( B)) in the hard regime (c_0, c_1 are some positive constants and srank( B) denotes the stable rank of B). In the end, we also provide numerical experiments to confirm the above claims.

Tasks

regression

Optimal Estimator for Linear Regression with Shuffled Labels

Abstract

Tasks

Reproductions