Self-Supervised Learning

Self-Supervised Learning was proposed to bring the success of supervised learning to unlabeled data. Producing a dataset with good labels is expensive, while unlabeled data is generated all the time, and the motivation of Self-Supervised Learning is to make use of this large amount of unlabeled data. The main idea is to generate labels from the unlabeled data itself, according to its structure or characteristics, and then train on this data in a supervised manner. Self-Supervised Learning is widely used in representation learning to make a model learn the latent features of the data. The technique is often employed in computer vision, video processing and robot control.

Source: Self-supervised Point Set Local Descriptors for Point Cloud Registration

Image source: LeCun
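
The idea of generating labels from the data itself can be illustrated with a small sketch. The example below uses rotation prediction as the pretext task, which is one common choice and not a method prescribed by this page. The function name `make_rotation_pretext` and the random 4x4 "images" are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_rotation_pretext(images, seed=0):
    """Turn unlabeled square images into a labeled rotation-prediction dataset.

    images: array of shape (N, H, W) with H == W; no human labels required.
    Returns (rotated, labels), where labels[i] in {0, 1, 2, 3} is the number
    of 90-degree counter-clockwise turns applied to images[i].
    """
    rng = np.random.default_rng(seed)
    # The "label" is produced by the transformation we apply ourselves.
    labels = rng.integers(0, 4, size=len(images))
    rotated = np.stack(
        [np.rot90(img, k=int(k)) for img, k in zip(images, labels)]
    )
    return rotated, labels

# Hypothetical unlabeled data: 8 random 4x4 "images".
unlabeled = np.random.default_rng(1).random((8, 4, 4))
x, y = make_rotation_pretext(unlabeled)
```

The pairs `(x, y)` can then be fed to any ordinary supervised classifier. The representation the classifier learns while predicting the rotation is what would be reused for downstream tasks.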

Papers



Title	Date	Tasks	Status	Hype	Score
An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios	Jun 13, 2024	Language Identification, Self-Supervised Learning	Code Available	2	5
Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram	Feb 2, 2024	Diagnostic, ECG Classification	Code Available	2	5
HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition	Jan 11, 2024	Contrastive Learning, Dynamic Facial Expression Recognition	Code Available	2	5
DiffMM: Multi-Modal Diffusion Model for Recommendation	Jun 17, 2024	Contrastive Learning, model	Code Available	2	5
DM-Codec: Distilling Multimodal Representations for Speech Tokenization	Oct 19, 2024	Self-Supervised Learning, Speech Tokenization	Code Available	2	5
Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation	Jan 1, 2024	General Knowledge, Navigate	Code Available	2	5
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers	Apr 20, 2022	Disentanglement, Self-Supervised Learning	Code Available	2	5
InfMAE: A Foundation Model in the Infrared Modality	Feb 1, 2024	Decoder, Self-Supervised Learning	Code Available	2	5
Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks	Oct 30, 2023	Benchmarking, object-detection	Code Available	2	5
Multistain Pretraining for Slide Representation Learning in Pathology	Aug 5, 2024	Representation Learning, Self-Supervised Learning	Code Available	2	5
LLMs as Zero-shot Graph Learners: Alignment of GNN Representations with LLM Token Embeddings	Aug 25, 2024	Language Modelling, Link Prediction	Code Available	2	5
DGFont++: Robust Deformable Generative Networks for Unsupervised Font Generation	Dec 30, 2022	Font Generation, Image-to-Image Translation	Code Available	2	5
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment	Jan 16, 2024	Disentanglement, Self-Supervised Learning	Code Available	2	5
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond	Dec 31, 2023	Representation Learning, Self-Supervised Learning	Code Available	2	5
Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing	Jan 29, 2024	GPU, Representation Learning	Code Available	2	5
A Simple Framework for Contrastive Learning of Visual Representations	Feb 13, 2020	Contrastive Learning, Image Classification	Code Available	2	5
CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding	Mar 1, 2022	3D Object Classification, 3D Point Cloud Linear Classification	Code Available	2	5
Deconstructing Denoising Diffusion Models for Self-Supervised Learning	Jan 25, 2024	Denoising, Image Generation	Code Available	2	5
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations	Sep 26, 2019	Common Sense Reasoning, GPU	Code Available	2	5
Mono-ViFI: A Unified Learning Framework for Self-supervised Single- and Multi-frame Monocular Depth Estimation	Jul 19, 2024	Data Augmentation, Depth Estimation	Code Available	2	5
Decoupled-and-Coupled Networks: Self-Supervised Hyperspectral Image Super-Resolution with Subpixel Fusion	May 7, 2022	Hyperspectral Image Super-Resolution, Image Super-Resolution	Code Available	2	5
Context Autoencoder for Self-Supervised Representation Learning	Feb 7, 2022	Decoder, Instance Segmentation	Code Available	2	5
Multiview Compressive Coding for 3D Reconstruction	Jan 19, 2023	3D Reconstruction, Decoder	Code Available	2	5
CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model	Mar 3, 2020	8k, Language Modeling	Code Available	2	5
Contrastive Audio-Visual Masked Autoencoder	Oct 2, 2022	Audio Classification, Audio Tagging	Code Available	2	5
PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders	Aug 16, 2024	3D Object Classification, 3D Point Cloud Classification	Code Available	2	5
PhilEO Bench: Evaluating Geo-Spatial Foundation Models	Jan 9, 2024	Density Estimation, Earth Observation	Code Available	2	5
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training	May 28, 2022	3D Object Detection, 3D Point Cloud Classification	Code Available	2	5
A Comprehensive Survey on Self-Supervised Learning for Recommendation	Apr 4, 2024	Contrastive Learning, Recommendation Systems	Code Available	2	5
A Foundation Model for Music Informatics	Nov 6, 2023	Information Retrieval, model	Code Available	2	5
A Versatile Framework for Multi-scene Person Re-identification	Mar 17, 2024	Data Augmentation, Person Re-Identification	Code Available	2	5
Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting	Jan 2, 2023	3D Object Detection, Motion Forecasting	Code Available	2	5
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation	Aug 5, 2024	Rhythm, Self-Supervised Learning	Code Available	2	5
A generalizable 3D framework and model for self-supervised learning in medical imaging	Jan 20, 2025	Medical Image Segmentation, Self-Supervised Learning	Code Available	2	5
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation	May 27, 2022	Contrastive Learning, image-classification	Code Available	2	5
A Multimodal Vision Foundation Model for Clinical Dermatology	Oct 19, 2024	Diagnostic, Lesion Segmentation	Code Available	2	5
DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks	Sep 10, 2024	Contrastive Learning, Image Reconstruction	Code Available	2	5
EMO-SUPERB: An In-depth Look at Speech Emotion Recognition	Feb 20, 2024	Emotion Recognition, Self-Supervised Learning	Code Available	2	5
Scaling up self-supervised learning for improved surgical foundation models	Jan 16, 2025	Self-Supervised Learning, Semantic Segmentation	Code Available	2	5
Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning	Jun 6, 2022	Self-Supervised Learning, Survival Prediction	Code Available	2	5
A Survey on Mixup Augmentations and Beyond	Sep 8, 2024	Image Classification, Self-Supervised Learning	Code Available	2	5
Self-Supervised Learning for Recommender Systems: A Survey	Mar 29, 2022	Recommendation Systems, Self-Supervised Learning	Code Available	2	5
Astock: A New Dataset and Automated Stock Trading based on Stock-specific News Analyzing Model	Jun 14, 2022	Decision Making, News Classification	Code Available	2	5
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture	Jan 19, 2023	Depth Estimation, Depth Prediction	Code Available	2	5
Self-Supervised Learning for Real-World Super-Resolution from Dual and Multiple Zoomed Observations	May 3, 2024	Optical Flow Estimation, Reference-based Super-Resolution	Code Available	2	5
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation	Feb 24, 2022	Audio Deepfake Detection, Data Augmentation	Code Available	1	5
AutoNovel: Automatically Discovering and Learning Novel Visual Categories	Jun 29, 2021	Clustering, Image Clustering	Code Available	1	5
Adaptive Graph Contrastive Learning for Recommendation	May 18, 2023	Collaborative Filtering, Contrastive Learning	Code Available	1	5
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation	Dec 5, 2023	Self-Supervised Learning, Speech-to-Speech Translation	Code Available	1	5
CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition	Oct 18, 2023	Audio Classification, Contrastive Learning	Code Available	1	5


Benchmark Results

DABS

#	Model	Metric	Claimed	Verified	Status
1	Pretraining: None	Images & Text	57.5	—	Unverified
2	Pretraining: ShED	Images & Text	54.3	—	Unverified
3	Pretraining: e-Mix	Images & Text	48.9	—	Unverified

STL-10

#	Model	Metric	Claimed	Verified	Status
1	ResNet50	Accuracy	91.7	—	Unverified
2	ResNet18	Accuracy	91.02	—	Unverified
3	MV-MR	Accuracy	89.67	—	Unverified

CIFAR10

#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	93.89	—	Unverified
2	ResNet18	average top-1 classification accuracy	92.58	—	Unverified

cifar100

#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	72.51	—	Unverified
2	ResNet18	average top-1 classification accuracy	69.31	—	Unverified

ImageNet-100 (TEMI Split)

#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet50)	Top-1 Accuracy	82.64	—	Unverified
2	CorInfomax (ResNet18)	Top-1 Accuracy	80.48	—	Unverified

TinyImageNet

#	Model	Metric	Claimed	Verified	Status
1	ResNet50	average top-1 classification accuracy	51.84	—	Unverified
2	ResNet18	average top-1 classification accuracy	51.67	—	Unverified

CIFAR-10

#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet18)	Top-1 Accuracy	93.18	—	Unverified

CIFAR-100

#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet18)	Top-1 Accuracy	71.61	—	Unverified

CREMA-D

#	Model	Metric	Claimed	Verified	Status
1	Hybrid BYOL-S/CvT	Accuracy	67.2	—	Unverified

Tiny ImageNet

#	Model	Metric	Claimed	Verified	Status
1	CorInfomax (ResNet50)	Top-1 Accuracy	54.86	—	Unverified