Let Go of Your Labels with Unsupervised Transfer

2024-06-11 · International Conference on Machine Learning 2024 · Code Available

Artyom Gadetsky, Yulun Jiang, Maria Brbic

Abstract

Foundation vision-language models have enabled remarkable zero-shot transferability of the pre-trained representations to a wide range of downstream tasks. However, to solve a new task, zero-shot transfer still necessitates human guidance to define visual categories that appear in the data. Here, we show that fully unsupervised transfer emerges when searching for the labeling of a dataset that induces maximal margin classifiers in representation spaces of different foundation models. We present TURTLE, a fully unsupervised method that effectively employs this guiding principle to uncover the underlying labeling of a downstream dataset without any supervision and task-specific representation learning. We evaluate TURTLE on a diverse benchmark suite of 26 datasets and show that it achieves new state-of-the-art unsupervised performance. Furthermore, TURTLE, although being fully unsupervised, outperforms zero-shot transfer baselines on a wide range of datasets. In particular, TURTLE matches the average performance of CLIP zero-shot on 26 datasets by employing the same representation space, spanning a wide range of architectures and model sizes. By guiding the search for the underlying labeling using the representation spaces of two foundation models, TURTLE surpasses zero-shot transfer and unsupervised prompt tuning baselines, demonstrating the surprising power and effectiveness of unsupervised transfer.
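The guiding principle in the abstract, that the right labeling is the one that linear classifiers can separate with a large margin in several representation spaces at once, can be illustrated on toy data. The sketch below is not the TURTLE implementation. It assumes synthetic Gaussian features standing in for two foundation-model embeddings, a binary task, and a plain subgradient-descent hinge-loss fit. The helper names `hinge_loss_after_fit` and `labeling_score` are hypothetical. It only compares two fixed candidate labelings and does not search over labelings.

```python
import numpy as np


def hinge_loss_after_fit(features, labels, steps=200, lr=0.1, reg=1e-2):
    """Fit a linear classifier by subgradient descent on the L2-regularized
    hinge loss and return the final mean hinge loss (lower = larger margins)."""
    n, d = features.shape
    y = np.where(labels == 1, 1.0, -1.0)  # map {0, 1} labels to {-1, +1}
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        margins = y * (features @ w + b)
        active = margins < 1.0  # samples that violate the margin
        grad_w = reg * w - (y[active, None] * features[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    margins = y * (features @ w + b)
    return float(np.maximum(0.0, 1.0 - margins).mean())


def labeling_score(spaces, labels):
    """Sum the fitted hinge losses of one labeling over all representation spaces."""
    return sum(hinge_loss_after_fit(z, labels) for z in spaces)


# Synthetic stand-ins for two embedding spaces (an assumption, not real CLIP/DINOv2 features).
rng = np.random.default_rng(0)
true_labels = np.repeat([0, 1], 50)
centers = np.where(true_labels[:, None] == 1, 2.0, -2.0)
space_a = centers + rng.normal(size=(100, 5))
space_b = centers + rng.normal(size=(100, 8))

# A candidate labeling that ignores the cluster structure.
random_labels = rng.integers(0, 2, size=100)

score_true = labeling_score([space_a, space_b], true_labels)
score_random = labeling_score([space_a, space_b], random_labels)
print("score of cluster-aligned labeling:", score_true)
print("score of random labeling:", score_random)
```

In this toy setting the cluster-aligned labeling should score well below the random one, which is the signal an unsupervised search over labelings could exploit.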


Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| Birdsnap | TURTLE (CLIP + DINOv2) | Accuracy | 68.1 | | Unverified |
| Caltech-101 | TURTLE (CLIP + DINOv2) | Accuracy | 89.8 | | Unverified |
| CIFAR-10 | TURTLE (CLIP + DINOv2) | Accuracy | 1 | | Unverified |
| CIFAR-100 | TURTLE (CLIP + DINOv2) | Accuracy | 0.9 | | Unverified |
| CLEVR Counts | TURTLE (CLIP + DINOv2) | Accuracy | 24 | | Unverified |
| Country211 | TURTLE (CLIP + DINOv2) | Accuracy | 11.1 | | Unverified |
| DTD | TURTLE (CLIP + DINOv2) | Accuracy | 57.3 | | Unverified |
| EuroSAT | TURTLE (CLIP + DINOv2) | Accuracy | 96.6 | | Unverified |
| FER2013 | TURTLE (CLIP + DINOv2) | Accuracy | 36.2 | | Unverified |
| FGVC-Aircraft | TURTLE (CLIP + DINOv2) | Accuracy | 36.5 | | Unverified |
| Flowers-102 | TURTLE (CLIP + DINOv2) | Accuracy | 99.6 | | Unverified |
| Food-101 | TURTLE (CLIP + DINOv2) | Accuracy | 92.2 | | Unverified |
| GTSRB | TURTLE (CLIP + DINOv2) | Accuracy | 48.4 | | Unverified |
| Hateful Memes | TURTLE (CLIP + DINOv2) | Accuracy | 54.2 | | Unverified |
| ImageNet | TURTLE (CLIP + DINOv2) | Accuracy | 72.9 | | Unverified |
| Kinetics-700 | TURTLE (CLIP + DINOv2) | Accuracy | 43 | | Unverified |
| KITTI | TURTLE (CLIP + DINOv2) | Accuracy | 39.4 | | Unverified |
| MNIST | TURTLE (CLIP + DINOv2) | Accuracy | 97.8 | | Unverified |
| Oxford-IIIT Pets | TURTLE (CLIP + DINOv2) | Accuracy | 92.3 | | Unverified |
| PCam | TURTLE (CLIP + DINOv2) | Accuracy | 52 | | Unverified |
| Rendered SST2 | TURTLE (CLIP + DINOv2) | Accuracy | 51.6 | | Unverified |
| RESISC45 | TURTLE (CLIP + DINOv2) | Accuracy | 89.6 | | Unverified |
| Stanford Cars | TURTLE (CLIP + DINOv2) | Accuracy | 0.65 | | Unverified |
| STL-10 | TURTLE (CLIP + DINOv2) | Accuracy | 1 | | Unverified |
| SUN397 | TURTLE (CLIP + DINOv2) | Accuracy | 67.9 | | Unverified |
| UCF101 | TURTLE (CLIP + DINOv2) | Accuracy | 82.3 | | Unverified |
