SOTAVerified

Adversarial Attack

An adversarial attack is a technique for finding a perturbation of the input that changes the prediction of a machine learning model. The perturbation can be small enough to be imperceptible to the human eye.

Source: Recurrent Attention Model with Log-Polar Mapping is Robust against Adversarial Attacks
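
To make the definition concrete, here is a minimal sketch of one well-known attack, the fast gradient sign method (FGSM), applied to a toy logistic-regression model. It is not taken from the source above: the weights `w`, bias `b`, input `x`, and budget `eps` are hypothetical values chosen for illustration, and the helper names are our own. The attack moves each input coordinate by at most `eps` in the direction that increases the loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def fgsm_perturb(w, b, x, y, eps):
    """Take one signed-gradient step of size eps that increases the loss for label y."""
    p = predict_proba(w, b, x)
    # Gradient of the binary cross-entropy loss w.r.t. the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (assumptions for illustration only).
w = [2.0, -3.0, 1.0]
b = 0.1
x = [0.5, 0.2, 0.1]
y = 1  # true label

x_adv = fgsm_perturb(w, b, x, y, eps=0.2)

print("clean prediction:      ", predict_proba(w, b, x) > 0.5)
print("adversarial prediction:", predict_proba(w, b, x_adv) > 0.5)
```

In this three-dimensional toy the budget `eps` has to be fairly large to flip the prediction. On image models the same idea is typically applied across thousands of pixels, so a much smaller per-pixel change can be enough.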

Papers

Showing 101–125 of 1808 papers

| Title | Status | Hype |
|---|---|---|
| Towards Robust and Secure Embodied AI: A Survey on Vulnerabilities and Attacks | | 0 |
| PAR-AdvGAN: Improving Adversarial Attack Capability with Progressive Auto-Regression AdvGAN | | 0 |
| To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models | Code | 1 |
| ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech | | 0 |
| AdvSwap: Covert Adversarial Perturbation with High Frequency Info-swapping for Autonomous Driving Perception | | 0 |
| MAA: Meticulous Adversarial Attack against Vision-Language Pre-trained Models | | 0 |
| Universal Adversarial Attack on Aligned Multimodal LLMs | | 0 |
| Democratic Training Against Universal Adversarial Perturbations | | 0 |
| Rigid Body Adversarial Attacks | | 0 |
| BitAbuse: A Dataset of Visually Perturbed Texts for Defending Phishing Attacks | Code | 0 |
| Real-Time Privacy Risk Measurement with Privacy Tokens for Gradient Leakage | | 0 |
| Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning | Code | 0 |
| MARAGE: Transferable Multi-Model Adversarial Attack for Retrieval-Augmented Generation Data Extraction | | 0 |
| CoRPA: Adversarial Image Generation for Chest X-rays Using Concept Vector Perturbations and Generative Models | | 0 |
| FRAUD-RLA: A new reinforcement learning adversarial attack against credit card fraud detection | | 0 |
| Refining Adaptive Zeroth-Order Optimization at Ease | | 0 |
| Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings | | 0 |
| Redefining Machine Unlearning: A Conformal Prediction-Motivated Approach | | 0 |
| Understanding Oversmoothing in GNNs as Consensus in Opinion Dynamics | | 0 |
| SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders | Code | 2 |
| HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns | Code | 1 |
| The Relationship Between Network Similarity and Transferability of Adversarial Attacks | | 0 |
| GreedyPixel: Fine-Grained Black-Box Adversarial Attack Via Greedy Algorithm | | 0 |
| Device-aware Optical Adversarial Attack for a Portable Projector-camera System | | 0 |
| Black-Box Adversarial Attack on Vision Language Models for Autonomous Driving | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Xu et al. | Attack: PGD20 | 78.68 | | Unverified |
| 2 | 3-ensemble of multi-resolution self-ensembles | Attack: AutoAttack | 78.13 | | Unverified |
| 3 | TRADES-ANCRA/ResNet18 | Attack: AutoAttack | 59.7 | | Unverified |
| 4 | AdvTraining [madry2018] | Attack: PGD20 | 48.44 | | Unverified |
| 5 | TRADES [zhang2019b] | Attack: PGD20 | 45.9 | | Unverified |
| 6 | XU-Net | Robust Accuracy | 1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | 3-ensemble of multi-resolution self-ensembles | Attack: AutoAttack | 51.28 | | Unverified |
| 2 | multi-resolution self-ensembles | Attack: AutoAttack | 47.85 | | Unverified |