
Differentially Private Sparse Linear Regression with Heavy-tailed Responses

2025-06-07

Xizhi Tian, Meng Ding, Touming Tao, Zihang Xiang, Di Wang


Abstract

As a fundamental problem in machine learning and differential privacy (DP), DP linear regression has been extensively studied. However, most existing methods focus primarily on either regular data distributions or low-dimensional cases with irregular data. To address these limitations, this paper provides a comprehensive study of DP sparse linear regression with heavy-tailed responses in high-dimensional settings. In the first part, we introduce the DP-IHT-H method, which leverages the Huber loss and private iterative hard thresholding to achieve an estimation error bound of \( O\left( s^{*\frac{1}{2}} \left( \frac{d}{n} \right)^{\frac{\zeta}{1+\zeta}} + s^{*\frac{1+2\zeta}{2+2\zeta}} \left( \frac{d}{n\epsilon^2} \right)^{\frac{\zeta}{1+\zeta}} \right) \) under the \( (\epsilon, \delta) \)-DP model, where \( n \) is the sample size, \( d \) is the dimensionality, \( s^* \) is the sparsity of the parameter, and \( \zeta \in (0, 1] \) characterizes the tail heaviness of the data. In the second part, we propose DP-IHT-L, which further improves the error bound under additional assumptions on the response and achieves \( O\left( (s^*)^{3/2} \cdot \frac{\sqrt{d}}{n\epsilon} \right) \). Compared to the first result, this bound is independent of the tail parameter \( \zeta \). Finally, through experiments on synthetic and real-world datasets, we demonstrate that our methods outperform standard DP algorithms designed for "regular" data.
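To make the general idea concrete, below is a minimal, illustrative sketch of the DP iterative-hard-thresholding pattern with a Huber loss that the abstract describes. It is not the authors' exact DP-IHT-H algorithm: the noise calibration (an even split of the Gaussian-mechanism budget across iterations), the step size, the clipping norm, and the function names are all simplifying assumptions for illustration.

```python
import numpy as np

def huber_grad(X, y, theta, tau):
    """Gradient of the Huber loss (threshold tau) for linear regression."""
    r = X @ theta - y                     # residuals
    clipped = np.clip(r, -tau, tau)      # Huber psi: linear tails beyond tau
    return X.T @ clipped / len(y)

def dp_iht_huber(X, y, s, eps, delta, tau=1.0, eta=0.1, T=50, clip=1.0, rng=None):
    """Illustrative DP-IHT: noisy Huber-gradient step + hard thresholding.

    Noise is calibrated with a standard Gaussian-mechanism bound, splitting
    (eps, delta) evenly over T rounds -- a simplifying assumption, not the
    paper's privacy accounting."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    theta = np.zeros(d)
    # Per-round Gaussian noise for roughly (eps/T, delta/T)-DP; the clipped
    # gradient has L2 sensitivity about clip/n under add/remove neighboring.
    sigma = clip / n * np.sqrt(2 * np.log(1.25 * T / delta)) * T / eps
    for _ in range(T):
        g = huber_grad(X, y, theta, tau)
        g = g / max(1.0, np.linalg.norm(g) / clip)   # clip gradient norm
        theta = theta - eta * (g + rng.normal(0, sigma, d))
        # Hard threshold: keep only the s largest-magnitude coordinates.
        keep = np.argpartition(np.abs(theta), -s)[-s:]
        mask = np.zeros(d, dtype=bool)
        mask[keep] = True
        theta[~mask] = 0.0
    return theta
```

The Huber gradient clips each residual at tau, which is what gives robustness to heavy-tailed responses: a single extreme response can shift the gradient by at most tau times its feature vector, so the sensitivity (and hence the required noise) stays bounded.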
