Self-Supervised Learning

Self-Supervised Learning is proposed for utilizing unlabeled data with the success of supervised learning. Producing a dataset with good labels is expensive, while unlabeled data is being generated all the time. The motivation of Self-Supervised Learning is to make use of the large amount of unlabeled data. The main idea of Self-Supervised Learning is to generate the labels from unlabeled data, according to the structure or characteristics of the data itself, and then train on this unsupervised data in a supervised manner. Self-Supervised Learning is wildly used in representation learning to make a model learn the latent features of the data. This technique is often employed in computer vision, video processing and robot control.

Source: Self-supervised Point Set Local Descriptors for Point Cloud Registration

Image source: LeCun

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–25 of 5044 papers

Title	Date	Tasks	Status	Hype
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training	Feb 5, 2025	Self-Supervised LearningSpeech Enhancement	CodeCode Available	9
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer	Sep 1, 2024	Self-Supervised Learningtext-to-speech	CodeCode Available	9
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning	Jun 11, 2025	Action AnticipationLarge Language Model	CodeCode Available	7
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis	May 14, 2025	DenoisingDepth Estimation	CodeCode Available	7
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders	May 20, 2022	Contrastive LearningLink Prediction	CodeCode Available	6
Transformers without Normalization	Mar 13, 2025	Self-Supervised Learning	CodeCode Available	5
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think	Oct 9, 2024	DenoisingImage Generation	CodeCode Available	5
Learning to (Learn at Test Time): RNNs with Expressive Hidden States	Jul 5, 2024	16k8k	CodeCode Available	5
Awesome Multi-modal Object Tracking	May 23, 2024	Autonomous DrivingKnowledge Distillation	CodeCode Available	5
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding	May 6, 2024	Metric LearningSelf-Supervised Learning	CodeCode Available	5
Know Your Self-supervised Learning: A Survey on Image-based Generative and Discriminative Training	May 23, 2023	Contrastive LearningSelf-Supervised Learning	CodeCode Available	5
GigaAM: Efficient Self-Supervised Learner for Speech Recognition	Jun 1, 2025	Automatic Speech RecognitionLanguage Modeling	CodeCode Available	4
Sonata: Self-Supervised Learning of Reliable Point Representations	Mar 20, 2025	3D Semantic SegmentationSelf-Supervised Learning	CodeCode Available	4
Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise	Dec 5, 2024	DenoisingImage Restoration	CodeCode Available	4
Multimodal Whole Slide Foundation Model for Pathology	Nov 29, 2024	Cross-Modal Retrievalmodel	CodeCode Available	4
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch	Oct 27, 2023	Self-Supervised LearningSpeech Enhancement	CodeCode Available	4
SSL4EO-L: Datasets and Foundation Models for Landsat Imagery	Jun 15, 2023	Cloud DetectionEarth Observation	CodeCode Available	4
A Survey on Large Language Models for Recommendation	May 31, 2023	Recommendation Systems	CodeCode Available	4
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN	May 27, 2022	Image ClassificationInstance Segmentation	CodeCode Available	4
A Framework For Contrastive Self-Supervised Learning And Designing A New Approach	Aug 31, 2020	Data AugmentationImage Classification	CodeCode Available	4
Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models	Jun 23, 2025	Domain AdaptationGPU	CodeCode Available	3
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D	Apr 19, 2025	DecoderObject Localization	CodeCode Available	3
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining	Mar 23, 2025	3DGSBenchmarking	CodeCode Available	3
MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization	Jan 2, 2025	Contrastive LearningKey Detection	CodeCode Available	3
STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes	Dec 31, 2024	Dynamic ReconstructionScene Flow Estimation	CodeCode Available	3

Show:10 25 50

← PrevPage 1 of 202Next →

All datasets DABS STL-10 CIFAR10 cifar100 ImageNet-100 (TEMI Split)TinyImageNet CIFAR-10 CIFAR-100 CREMA-D Tiny ImageNet

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Pretraining: None	Images & Text	57.5	—	Unverified
2	Pretraining: ShED	Images & Text	54.3	—	Unverified
3	Pretraining: e-Mix	Images & Text	48.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ResNet50	Accuracy	91.7	—	Unverified
2	ResNet18	Accuracy	91.02	—	Unverified
3	MV-MR	Accuracy	89.67	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	93.89	—	Unverified
2	ResNet18	average top-1 classification accuracy	92.58	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	72.51	—	Unverified
2	ResNet18	average top-1 classification accuracy	69.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet50)	Top-1 Accuracy	82.64	—	Unverified
2	CorInfomax (ResNet18)	Top-1 Accuracy	80.48	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	51.84	—	Unverified
2	ResNet18	average top-1 classification accuracy	51.67	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet18)	Top-1 Accuracy	93.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet18)	Top-1 Accuracy	71.61	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Hybrid BYOL-S/CvT	Accuracy	67.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet50)	Top-1 Accuracy	54.86	—	Unverified