Self-Supervised Learning

Self-Supervised Learning was proposed to extend the success of supervised learning to unlabeled data. Producing a well-labeled dataset is expensive, while unlabeled data is generated all the time, and the motivation of Self-Supervised Learning is to make use of that large supply. The main idea is to generate labels from the unlabeled data itself, according to its structure or characteristics, and then train on this data in a supervised manner. Self-Supervised Learning is widely used in representation learning, where a model learns the latent features of the data. The technique is often employed in computer vision, video processing and robot control.

Source: Self-supervised Point Set Local Descriptors for Point Cloud Registration
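The idea of generating labels from the data itself can be illustrated with a small sketch. It uses rotation prediction, a commonly used pretext task chosen here only for illustration and not taken from the source above: each unlabeled image is rotated by a random multiple of 90 degrees, and the rotation index becomes the label. The function name make_rotation_pretext and the random stand-in images are hypothetical.

```python
import numpy as np

def make_rotation_pretext(images, rng):
    """Turn unlabeled square images into (input, label) pairs.

    Each image is rotated by k * 90 degrees; k in {0, 1, 2, 3} is the
    label, so the labels come from the data and need no annotation.
    """
    ks = rng.integers(0, 4, size=len(images))
    rotated = np.stack([np.rot90(img, k=int(k)) for img, k in zip(images, ks)])
    return rotated, ks

rng = np.random.default_rng(0)
# Stand-in for 8 unlabeled 32x32 grayscale images (assumption: random data).
unlabeled = rng.random((8, 32, 32))
inputs, labels = make_rotation_pretext(unlabeled, rng)
```

The resulting inputs and labels can then be passed to any ordinary supervised classifier. The representation it learns while predicting rotations is what would be reused downstream.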


Papers


Showing 2701–2725 of 5044 papers

Title	Date	Tasks	Status
Video as the New Language for Real-World Decision Making	Feb 27, 2024	Decision Making, In-Context Learning	Unverified
Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound	Aug 21, 2024	Audio Generation, Audio Synthesis	Unverified
Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition	Aug 22, 2018	Action Recognition, Activity Recognition	Unverified
Video Representation Learning by Recognizing Temporal Transformations	Jul 21, 2020	Action Recognition, Representation Learning	Unverified
Video Transformers: A Survey	Jan 16, 2022	Action Classification, Self-Supervised Learning	Unverified
Video Understanding as Machine Translation	Jun 12, 2020	Machine Translation, Metric Learning	Unverified
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding	Mar 24, 2025	8k, GPU	Unverified
VieSum: How Robust Are Transformer-based Models on Vietnamese Summarization?	Oct 8, 2021	Abstractive Text Summarization, Decoder	Unverified
VietASR: Achieving Industry-level Vietnamese ASR with 50-hour labeled data and Large-Scale Speech Pretraining	May 23, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
ViewMix: Augmentation for Robust Representation in Self-Supervised Learning	Sep 6, 2023	Representation Learning, Self-Supervised Learning	Unverified
ViewNet: Unsupervised Viewpoint Estimation from Conditional Generation	Dec 1, 2022	Image Reconstruction, Self-Supervised Learning	Unverified
Views Can Be Deceiving: Improved SSL Through Feature Space Augmentation	May 28, 2024	Representation Learning, Self-Supervised Learning	Unverified
VIGraph: Generative Self-supervised Learning for Class-Imbalanced Node Classification	Nov 2, 2023	Contrastive Learning, Node Classification	Unverified
Vi-MIX FOR SELF-SUPERVISED VIDEO REPRESENTATION	Sep 29, 2021	Action Recognition, Representation Learning	Unverified
Virtual Node Generation for Node Classification in Sparsely-Labeled Graphs	Sep 12, 2024	Graph Learning, Meta-Learning	Unverified
Visible and infrared self-supervised fusion trained on a single example	Jul 9, 2023	object-detection, Object Detection	Unverified
Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft	May 9, 2024	All, Language Modeling	Unverified
Vision Learners Meet Web Image-Text Pairs	Jan 17, 2023	Benchmarking, Self-Supervised Learning	Unverified
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision	Feb 16, 2022	Action Classification, Action Recognition	Unverified
Vision Transformers: State of the Art and Research Challenges	Jul 7, 2022	3D Reconstruction, Image Segmentation	Unverified
Visual Lexicon: Rich Image Features in Language Space	Dec 9, 2024	Image Generation, Image Reconstruction	Unverified
Visually Guided Self Supervised Learning of Speech Representations	Jan 13, 2020	Emotion Recognition, Representation Learning	Unverified
Visual Representation Learning with Stochastic Frame Prediction	Jun 11, 2024	Decoder, Pose Tracking	Unverified
Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations	May 12, 2022	Object, Object Localization	Unverified
ViTAR: Vision Transformer with Any Resolution	Mar 27, 2024	Self-Supervised Learning, Semantic Segmentation	Unverified


Datasets: DABS, STL-10, CIFAR10, cifar100, ImageNet-100 (TEMI Split), TinyImageNet, CIFAR-10, CIFAR-100, CREMA-D, Tiny ImageNet

Benchmark Results

Dataset: DABS
#	Model	Metric	Claimed	Verified	Status
1	Pretraining: None	Images & Text	57.5	—	Unverified
2	Pretraining: ShED	Images & Text	54.3	—	Unverified
3	Pretraining: e-Mix	Images & Text	48.9	—	Unverified

Dataset: STL-10
#	Model	Metric	Claimed	Verified	Status
1	ResNet50	Accuracy	91.7	—	Unverified
2	ResNet18	Accuracy	91.02	—	Unverified
3	MV-MR	Accuracy	89.67	—	Unverified

Dataset: CIFAR10
#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	93.89	—	Unverified
2	ResNet18	average top-1 classification accuracy	92.58	—	Unverified

Dataset: cifar100
#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	72.51	—	Unverified
2	ResNet18	average top-1 classification accuracy	69.31	—	Unverified

Dataset: ImageNet-100 (TEMI Split)
#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet50)	Top-1 Accuracy	82.64	—	Unverified
2	CorInfomax (ResNet18)	Top-1 Accuracy	80.48	—	Unverified

Dataset: TinyImageNet
#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	51.84	—	Unverified
2	ResNet18	average top-1 classification accuracy	51.67	—	Unverified

Dataset: CIFAR-10
#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet18)	Top-1 Accuracy	93.18	—	Unverified

Dataset: CIFAR-100
#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet18)	Top-1 Accuracy	71.61	—	Unverified

Dataset: CREMA-D
#	Model	Metric	Claimed	Verified	Status
1	Hybrid BYOL-S/CvT	Accuracy	67.2	—	Unverified

Dataset: Tiny ImageNet
#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet50)	Top-1 Accuracy	54.86	—	Unverified