On the existence of consistent adversarial attacks in high-dimensional linear classification

2025-06-14

Matteo Vilucchio, Lenka Zdeborová, Bruno Loureiro

Abstract

What fundamentally distinguishes an adversarial attack from a misclassification due to limited model expressivity or finite data? In this work, we investigate this question in the setting of high-dimensional binary classification, where statistical effects due to limited data availability play a central role. We introduce a new error metric that precisely captures this distinction, quantifying model vulnerability to consistent adversarial attacks -- perturbations that preserve the ground-truth labels. Our main technical contribution is an exact and rigorous asymptotic characterization of this metric in both well-specified models and latent space models, revealing different vulnerability patterns compared to standard robust error measures. The theoretical results demonstrate that as models become more overparameterized, their vulnerability to label-preserving perturbations grows, offering theoretical insight into the mechanisms underlying model sensitivity to adversarial attacks.
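The distinction the abstract draws can be illustrated numerically. The sketch below is a minimal Monte Carlo illustration, not the paper's exact construction: it assumes a well-specified linear setup with a hypothetical teacher direction `w_star` and an imperfectly learned student `w_hat`, applies the worst-case L2 perturbation of radius `eps` against the student, and counts an attack as "consistent" only when that perturbation leaves the ground-truth (teacher) label unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, eps = 200, 5000, 0.5  # dimension, samples, attack radius (all illustrative)

# Hypothetical well-specified setup: teacher direction and a noisy student.
w_star = rng.normal(size=d) / np.sqrt(d)                 # ground-truth labels come from w_star
w_hat = w_star + 0.5 * rng.normal(size=d) / np.sqrt(d)   # imperfect learned model

X = rng.normal(size=(n, d))
y_true = np.sign(X @ w_star)
y_pred = np.sign(X @ w_hat)

# Worst-case L2 attack on a linear student: push each point by eps against
# its predicted label, along the student's weight direction.
delta = -eps * y_pred[:, None] * w_hat / np.linalg.norm(w_hat)
X_adv = X + delta

flipped = np.sign(X_adv @ w_hat) != y_pred      # the attack fools the model
consistent = np.sign(X_adv @ w_star) == y_true  # the ground-truth label survives

robust_err = np.mean(flipped)                   # standard robust error
consistent_err = np.mean(flipped & consistent)  # consistent adversarial error
print(f"standard robust error: {robust_err:.3f}")
print(f"consistent adv. error: {consistent_err:.3f}")
```

By construction the consistent adversarial error is at most the standard robust error; the gap between the two counts points that were "attacked" only in the sense that the perturbation crossed the true decision boundary, i.e. genuine label changes rather than model vulnerability.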
