SOTAVerified

AnomalyDINO: Boosting Patch-based Few-shot Anomaly Detection with DINOv2

2024-05-23 · Code Available

Simon Damm, Mike Laszkiewicz, Johannes Lederer, Asja Fischer


Abstract

Recent advances in multimodal foundation models have set new standards in few-shot anomaly detection. This paper explores whether high-quality visual features alone are sufficient to rival existing state-of-the-art vision-language models. We affirm this by adapting DINOv2 for one-shot and few-shot anomaly detection, with a focus on industrial applications. We show that this approach not only rivals existing techniques but can even outmatch them in many settings. Our proposed vision-only approach, AnomalyDINO, is based on patch similarities and enables both image-level anomaly prediction and pixel-level anomaly segmentation. The approach is methodologically simple and training-free, and thus requires no additional data for fine-tuning or meta-learning. Despite its simplicity, AnomalyDINO achieves state-of-the-art results in one- and few-shot anomaly detection (e.g., pushing the one-shot performance on MVTec-AD from an AUROC of 93.1% to 96.6%). The reduced overhead, coupled with its outstanding few-shot performance, makes AnomalyDINO a strong candidate for fast deployment, e.g., in industrial contexts.
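The core recipe described in the abstract — score each patch of a test image by its distance to the nearest patch from the few-shot reference images, then aggregate into an image-level score — can be sketched as below. This is a minimal illustration, not the authors' implementation: DINOv2 patch embeddings are stood in for by plain float arrays, and aggregating via a high quantile of patch distances is an assumption made here for robustness, not necessarily the paper's exact statistic.

```python
import numpy as np

def patch_anomaly_scores(ref_patches: np.ndarray, test_patches: np.ndarray) -> np.ndarray:
    """Cosine distance from each test patch to its nearest reference patch.

    In AnomalyDINO these would be DINOv2 patch embeddings from the nominal
    few-shot images (reference) and the query image (test); here any
    (n_patches, dim) float arrays work.
    """
    ref = ref_patches / np.linalg.norm(ref_patches, axis=1, keepdims=True)
    test = test_patches / np.linalg.norm(test_patches, axis=1, keepdims=True)
    # (n_test, n_ref) cosine similarities; nearest neighbor = row-wise max
    nearest_sim = (test @ ref.T).max(axis=1)
    return 1.0 - nearest_sim  # large distance = patch unlike any nominal patch

def image_score(patch_scores: np.ndarray, q: float = 0.99) -> float:
    """Aggregate patch-level distances into one image-level anomaly score.

    Using a high quantile rather than the plain max (an illustrative choice)
    keeps a single outlier patch from dominating the score.
    """
    return float(np.quantile(patch_scores, q))
```

Once reshaped to the patch grid and upsampled to image resolution, the same per-patch distances serve as the pixel-level anomaly segmentation map mentioned in the abstract.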


Benchmark Results

Dataset    Model                        Metric           Claimed  Verified  Status
MVTec AD   AnomalyDINO-S (full-shot)    Detection AUROC  99.5     —         Unverified
MVTec AD   AnomalyDINO-S (4-shot)       Detection AUROC  97.7     —         Unverified
MVTec AD   AnomalyDINO-S (2-shot)       Detection AUROC  96.9     —         Unverified
MVTec AD   AnomalyDINO-S (1-shot)       Detection AUROC  96.6     —         Unverified
VisA       AnomalyDINO-S (full-shot)    Detection AUROC  97.6     —         Unverified
VisA       AnomalyDINO-S (4-shot)       Detection AUROC  92.6     —         Unverified
VisA       AnomalyDINO-S (2-shot)       Detection AUROC  89.7     —         Unverified
VisA       AnomalyDINO-S (1-shot)       Detection AUROC  87.4     —         Unverified

Reproductions