Principal Eigenvalue Regularization for Improved Worst-Class Certified Robustness of Smoothed Classifiers

2025-03-21Unverified0· sign in to hype

Gaojie Jin, Tianjin Huang, Ronghui Mu, Xiaowei Huang

Unverified — Be the first to reproduce this paper.

Abstract

Recent studies have identified a critical challenge in deep neural networks (DNNs) known as ``robust fairness", where models exhibit significant disparities in robust accuracy across different classes. While prior work has attempted to address this issue in adversarial robustness, the study of worst-class certified robustness for smoothed classifiers remains unexplored. Our work bridges this gap by developing a PAC-Bayesian bound for the worst-class error of smoothed classifiers. Through theoretical analysis, we demonstrate that the largest eigenvalue of the smoothed confusion matrix fundamentally influences the worst-class error of smoothed classifiers. Based on this insight, we introduce a regularization method that optimizes the largest eigenvalue of smoothed confusion matrix to enhance worst-class accuracy of the smoothed classifier and further improve its worst-class certified robustness. We provide extensive experimental validation across multiple datasets and model architectures to demonstrate the effectiveness of our approach.

Tasks

Adversarial Robustness Fairness

Principal Eigenvalue Regularization for Improved Worst-Class Certified Robustness of Smoothed Classifiers

Abstract

Tasks

Reproductions