
Adversarial Attack

An adversarial attack is a technique for finding a perturbation of an input that changes a machine learning model's prediction. The perturbation can be small enough to be imperceptible to the human eye.

Source: Recurrent Attention Model with Log-Polar Mapping is Robust against Adversarial Attacks
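As a rough illustration of the definition above, the sketch below applies a single signed-gradient step (the fast gradient sign method, one common attack, not necessarily the one used in the cited source) to a toy logistic-regression classifier. The weights `w`, bias `b`, input `x`, and budget `eps` are hypothetical values chosen only so that the flip is easy to see.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    """Return the predicted class (0 or 1) of a linear classifier."""
    return int(sigmoid(w @ x + b) >= 0.5)

def fgsm_perturbation(w, b, x, y, eps):
    """One signed-gradient step that increases the cross-entropy loss.

    For logistic regression, d(loss)/dx = (sigmoid(w.x + b) - y) * w,
    so the perturbation is eps times the sign of that gradient.
    """
    grad_x = (sigmoid(w @ x + b) - y) * w
    return eps * np.sign(grad_x)

# Hypothetical toy model and input (assumptions, not from the source).
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.2, 0.05, 0.1])
y = 1  # true label of x

delta = fgsm_perturbation(w, b, x, y, eps=0.1)
x_adv = x + delta

print("clean prediction:      ", predict(w, b, x))
print("adversarial prediction:", predict(w, b, x_adv))
print("largest change per feature:", np.max(np.abs(delta)))
```

Each feature moves by at most `eps`, yet the predicted class changes. Attacks on deep networks follow the same idea but obtain the gradient by backpropagation and often iterate the step, as in PGD.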

Papers


Showing 1021–1030 of 1808 papers

Title	Date	Tasks	Status	Hype
Seeing is Deceiving: Exploitation of Visual Pathways in Multi-Modal Language Models	Nov 7, 2024	Adversarial Attack, Image Captioning	Unverified	0
Seeing the Threat: Vulnerabilities in Vision-Language Models to Adversarial Attack	May 28, 2025	Adversarial Attack, Safety Alignment	Unverified	0
Seeking Flat Minima over Diverse Surrogates for Improved Adversarial Transferability: A Theoretical Framework and Algorithmic Instantiation	Apr 23, 2025	Adversarial Attack	Unverified	0
SAM Meets UAP: Attacking Segment Anything Model With Universal Adversarial Perturbation	Oct 19, 2023	Adversarial Attack, Adversarial Robustness	Unverified	0
Self adversarial attack as an augmentation method for immunohistochemical stainings	Mar 21, 2021	Adversarial Attack, Image-to-Image Translation	Unverified	0
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner	Jun 8, 2024	Adversarial Attack, LLM Jailbreak	Unverified	0
SELF-KNOWLEDGE DISTILLATION ADVERSARIAL ATTACK	Sep 25, 2019	Adversarial Attack, Knowledge Distillation	Unverified	0
Self-Supervised Adversarial Example Detection by Disentangled Representation	May 8, 2021	Adversarial Attack	Unverified	0
Self-Supervised Contrastive Learning with Adversarial Perturbations for Robust Pretrained Language Models	Nov 16, 2021	Adversarial Attack, Contrastive Learning	Unverified	0
Self-Supervised Representation Learning for Adversarial Attack Detection	Jul 5, 2024	Adversarial Attack, Adversarial Attack Detection	Unverified	0



Datasets: CIFAR-10, CIFAR-100

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xu et al.	Attack: PGD20	78.68	—	Unverified
2	3-ensemble of multi-resolution self-ensembles	Attack: AutoAttack	78.13	—	Unverified
3	TRADES-ANCRA/ResNet18	Attack: AutoAttack	59.7	—	Unverified
4	AdvTraining [madry2018]	Attack: PGD20	48.44	—	Unverified
5	TRADES [zhang2019b]	Attack: PGD20	45.9	—	Unverified
6	XU-Net	Robust Accuracy	1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	3-ensemble of multi-resolution self-ensembles	Attack: AutoAttack	51.28	—	Unverified
2	multi-resolution self-ensembles	Attack: AutoAttack	47.85	—	Unverified