Let Go of Your Labels with Unsupervised Transfer
Artyom Gadetsky, Yulun Jiang, Maria Brbic
Code: github.com/mlbio-epfl/turtle (official PyTorch implementation, ★ 80)
Abstract
Foundation vision-language models have enabled remarkable zero-shot transferability of pre-trained representations to a wide range of downstream tasks. However, to solve a new task, zero-shot transfer still requires human guidance to define the visual categories that appear in the data. Here, we show that fully unsupervised transfer emerges when searching for the labeling of a dataset that induces maximal-margin classifiers in the representation spaces of different foundation models. We present TURTLE, a fully unsupervised method that effectively employs this guiding principle to uncover the underlying labeling of a downstream dataset without any supervision or task-specific representation learning. We evaluate TURTLE on a diverse benchmark suite of 26 datasets and show that it achieves new state-of-the-art unsupervised performance. Furthermore, despite being fully unsupervised, TURTLE outperforms zero-shot transfer baselines on a wide range of datasets. In particular, TURTLE matches the average performance of CLIP zero-shot across the 26 datasets when employing the same representation space, spanning a wide range of architectures and model sizes. By guiding the search for the underlying labeling with the representation spaces of two foundation models, TURTLE surpasses zero-shot transfer and unsupervised prompt-tuning baselines, demonstrating the surprising power and effectiveness of unsupervised transfer.
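The guiding principle stated in the abstract — prefer the labeling that induces maximal-margin classifiers in several frozen representation spaces — can be illustrated with a small NumPy sketch. This is not the authors' implementation: the toy features, the logistic-regression probe used as a margin proxy, and all names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two frozen "foundation model" representation spaces for the same 40 samples.
# The underlying (unknown) labeling separates the samples in BOTH spaces.
n = 40
labels_true = np.array([0] * 20 + [1] * 20)
Z1 = rng.normal(size=(n, 5)) + labels_true[:, None] * 3.0  # space of model 1
Z2 = rng.normal(size=(n, 5)) - labels_true[:, None] * 3.0  # space of model 2

def probe_loss(Z, y, steps=200, lr=0.1):
    """Fit a logistic-regression probe on frozen features Z for labeling y
    by gradient descent and return its final training loss. A lower loss
    here is a proxy for a larger-margin linear separation under y."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        g = p - y                      # gradient of the logistic loss
        w -= lr * Z.T @ g / len(y)
        b -= lr * g.mean()
    p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

def score(y):
    # TURTLE-style criterion (sketch): a labeling is scored by how well
    # linear classifiers it induces fit in BOTH representation spaces.
    return probe_loss(Z1, y) + probe_loss(Z2, y)

random_labels = rng.integers(0, 2, size=n)
print(score(labels_true), score(random_labels))
```

The underlying labeling yields linearly separable classes in both spaces, so its probes reach a much lower loss than probes trained on a random labeling; the full method searches over labelings to minimize exactly this kind of criterion rather than comparing two fixed candidates.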
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Birdsnap | TURTLE (CLIP + DINOv2) | Accuracy | 68.1 | — | Unverified |
| Caltech-101 | TURTLE (CLIP + DINOv2) | Accuracy | 89.8 | — | Unverified |
| CIFAR-10 | TURTLE (CLIP + DINOv2) | Accuracy | 100.0 | — | Unverified |
| CIFAR-100 | TURTLE (CLIP + DINOv2) | Accuracy | 90.0 | — | Unverified |
| CLEVR Counts | TURTLE (CLIP + DINOv2) | Accuracy | 24.0 | — | Unverified |
| Country211 | TURTLE (CLIP + DINOv2) | Accuracy | 11.1 | — | Unverified |
| DTD | TURTLE (CLIP + DINOv2) | Accuracy | 57.3 | — | Unverified |
| EuroSAT | TURTLE (CLIP + DINOv2) | Accuracy | 96.6 | — | Unverified |
| FER2013 | TURTLE (CLIP + DINOv2) | Accuracy | 36.2 | — | Unverified |
| FGVC-Aircraft | TURTLE (CLIP + DINOv2) | Accuracy | 36.5 | — | Unverified |
| Flowers-102 | TURTLE (CLIP + DINOv2) | Accuracy | 99.6 | — | Unverified |
| Food-101 | TURTLE (CLIP + DINOv2) | Accuracy | 92.2 | — | Unverified |
| GTSRB | TURTLE (CLIP + DINOv2) | Accuracy | 48.4 | — | Unverified |
| Hateful Memes | TURTLE (CLIP + DINOv2) | Accuracy | 54.2 | — | Unverified |
| ImageNet | TURTLE (CLIP + DINOv2) | Accuracy | 72.9 | — | Unverified |
| Kinetics-700 | TURTLE (CLIP + DINOv2) | Accuracy | 43.0 | — | Unverified |
| KITTI | TURTLE (CLIP + DINOv2) | Accuracy | 39.4 | — | Unverified |
| MNIST | TURTLE (CLIP + DINOv2) | Accuracy | 97.8 | — | Unverified |
| Oxford-IIIT Pets | TURTLE (CLIP + DINOv2) | Accuracy | 92.3 | — | Unverified |
| PCam | TURTLE (CLIP + DINOv2) | Accuracy | 52.0 | — | Unverified |
| Rendered SST2 | TURTLE (CLIP + DINOv2) | Accuracy | 51.6 | — | Unverified |
| RESISC45 | TURTLE (CLIP + DINOv2) | Accuracy | 89.6 | — | Unverified |
| Stanford Cars | TURTLE (CLIP + DINOv2) | Accuracy | 65.0 | — | Unverified |
| STL-10 | TURTLE (CLIP + DINOv2) | Accuracy | 100.0 | — | Unverified |
| SUN397 | TURTLE (CLIP + DINOv2) | Accuracy | 67.9 | — | Unverified |
| UCF101 | TURTLE (CLIP + DINOv2) | Accuracy | 82.3 | — | Unverified |