SOTAVerified

Self-Supervised Learning

Self-Supervised Learning was proposed to extend the success of supervised learning to unlabeled data. Producing a well-labeled dataset is expensive, while unlabeled data is generated all the time. The motivation of Self-Supervised Learning is to make use of this large amount of unlabeled data. The main idea is to generate labels from the unlabeled data itself, according to its structure or characteristics, and then train on this data in a supervised manner. Self-Supervised Learning is widely used in representation learning to make a model learn the latent features of the data. The technique is often employed in computer vision, video processing, and robot control.

Source: Self-supervised Point Set Local Descriptors for Point Cloud Registration

Image source: LeCun
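
The description above says that labels are generated from the unlabeled data itself and then used for ordinary supervised training. As a rough illustration only, the sketch below builds such labels for one common pretext task, masked-token prediction. The function name `make_masked_pairs`, the `[MASK]` placeholder, and the toy corpus are assumptions made for this example and do not come from any of the listed papers.

```python
import random


def make_masked_pairs(sentences, mask_token="[MASK]", seed=0):
    """Turn unlabeled sentences into (input, label) pairs.

    One token per sentence is hidden; the hidden token becomes the label,
    so the supervision signal comes from the data itself.
    """
    rng = random.Random(seed)
    pairs = []
    for sentence in sentences:
        tokens = sentence.split()
        if not tokens:
            continue  # nothing to mask in an empty sentence
        idx = rng.randrange(len(tokens))
        target = tokens[idx]
        masked = tokens[:idx] + [mask_token] + tokens[idx + 1:]
        pairs.append((" ".join(masked), target))
    return pairs


# Hypothetical unlabeled corpus, used only for illustration.
corpus = [
    "unlabeled data is generated all the time",
    "labels are expensive to produce",
]
for masked_input, label in make_masked_pairs(corpus):
    print(masked_input, "->", label)
```

A model trained to predict `label` from `masked_input` has to learn something about the structure of the text, which is the representation-learning effect the description refers to. Other pretext tasks (for example, predicting an image rotation) follow the same pattern of deriving the target from the input.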

Papers

Showing 150 of 5044 papers

Title | Status | Hype
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer | Code | 9
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training | Code | 9
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning | Code | 7
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis | Code | 7
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders | Code | 6
Transformers without Normalization | Code | 5
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding | Code | 5
Learning to (Learn at Test Time): RNNs with Expressive Hidden States | Code | 5
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think | Code | 5
Awesome Multi-modal Object Tracking | Code | 5
Know Your Self-supervised Learning: A Survey on Image-based Generative and Discriminative Training | Code | 5
SSL4EO-L: Datasets and Foundation Models for Landsat Imagery | Code | 4
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch | Code | 4
Sonata: Self-Supervised Learning of Reliable Point Representations | Code | 4
GigaAM: Efficient Self-Supervised Learner for Speech Recognition | Code | 4
A Survey on Large Language Models for Recommendation | Code | 4
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | Code | 4
Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise | Code | 4
Multimodal Whole Slide Foundation Model for Pathology | Code | 4
A Framework For Contrastive Self-Supervised Learning And Designing A New Approach | Code | 4
TSLANet: Rethinking Transformers for Time Series Representation Learning | Code | 3
STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes | Code | 3
The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech | Code | 3
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | Code | 3
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining | Code | 3
Robust and Efficient Medical Imaging with Self-Supervision | Code | 3
Pushing the limits of raw waveform speaker recognition | Code | 3
SARATR-X: Toward Building A Foundation Model for SAR Target Recognition | Code | 3
Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket | Code | 3
VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis | Code | 3
Calibre: Towards Fair and Accurate Personalized Federated Learning with Self-Supervised Learning | Code | 3
Moving Object Segmentation: All You Need Is SAM (and Flow) | Code | 3
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D | Code | 3
Accelerating Goal-Conditioned RL Algorithms and Research | Code | 3
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Code | 3
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs | Code | 3
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation | Code | 3
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Code | 3
Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Code | 3
EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training | Code | 3
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach | Code | 3
Leveraging Self-Supervised Learning for Speaker Diarization | Code | 3
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer | Code | 3
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Code | 3
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data | Code | 3
EEGPT: Pretrained Transformer for Universal and Reliable Representation of EEG Signals | Code | 3
Emergence of Segmentation with Minimalistic White-Box Transformers | Code | 3
MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization | Code | 3
How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model | Code | 3
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations | Code | 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Pretraining: None | Images & Text | 57.5 | - | Unverified
2 | Pretraining: ShED | Images & Text | 54.3 | - | Unverified
3 | Pretraining: e-Mix | Images & Text | 48.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | Accuracy | 91.7 | - | Unverified
2 | ResNet18 | Accuracy | 91.02 | - | Unverified
3 | MV-MR | Accuracy | 89.67 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | average top-1 classification accuracy | 93.89 | - | Unverified
2 | ResNet18 | average top-1 classification accuracy | 92.58 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | average top-1 classification accuracy | 72.51 | - | Unverified
2 | ResNet18 | average top-1 classification accuracy | 69.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CorInfomax (ResNet50) | Top-1 Accuracy | 82.64 | - | Unverified
2 | CorInfomax (ResNet18) | Top-1 Accuracy | 80.48 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | average top-1 classification accuracy | 51.84 | - | Unverified
2 | ResNet18 | average top-1 classification accuracy | 51.67 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CorInfomax (ResNet18) | Top-1 Accuracy | 93.18 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CorInfomax (ResNet18) | Top-1 Accuracy | 71.61 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Hybrid BYOL-S/CvT | Accuracy | 67.2 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CorInfomax (ResNet50) | Top-1 Accuracy | 54.86 | - | Unverified