SOTAVerified

Self-Supervised Learning

Self-Supervised Learning was proposed to bring the success of supervised learning to unlabeled data. Producing a dataset with good labels is expensive, while unlabeled data is generated all the time, so the motivation is to make use of this large pool of unlabeled data. The main idea is to generate labels from the unlabeled data itself, according to its structure or characteristics, and then train on those generated labels in a supervised manner. Self-Supervised Learning is widely used in representation learning to make a model learn the latent features of the data. The technique is often employed in computer vision, video processing and robot control.

Source: Self-supervised Point Set Local Descriptors for Point Cloud Registration
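
The label-generation idea described above can be sketched with a small pretext task. This is a minimal illustration and not the method of the cited paper: it assumes a rotation-prediction task, where each unlabeled image is rotated by a random multiple of 90 degrees and the number of turns becomes the label. The names `rot90` and `make_rotation_pairs`, and the tiny 2x2 "images", are hypothetical and chosen only for this example.

```python
import random


def rot90(grid):
    """Rotate a 2D list 90 degrees counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def make_rotation_pairs(images, rng):
    """Build (rotated_image, label) pairs from unlabeled images.

    The label k in {0, 1, 2, 3} is the number of 90-degree turns applied,
    so it comes from the transformation itself, not from an annotator.
    """
    pairs = []
    for img in images:
        k = rng.randrange(4)
        rotated = img
        for _ in range(k):
            rotated = rot90(rotated)
        pairs.append((rotated, k))
    return pairs


# Toy unlabeled data (assumed for illustration): three 2x2 "images".
rng = random.Random(0)
unlabeled = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 0], [0, 0]]]
pairs = make_rotation_pairs(unlabeled, rng)
```

The resulting pairs can be fed to any ordinary supervised classifier. The hope is that predicting the rotation forces the model to learn features of the data that transfer to other tasks.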

Image source: LeCun

Papers

Showing 2701–2725 of 5044 papers

Title | Status | Hype
Video as the New Language for Real-World Decision Making | | 0
Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound | | 0
Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition | | 0
Video Representation Learning by Recognizing Temporal Transformations | | 0
Video Transformers: A Survey | | 0
Video Understanding as Machine Translation | | 0
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding | | 0
VieSum: How Robust Are Transformer-based Models on Vietnamese Summarization? | | 0
VietASR: Achieving Industry-level Vietnamese ASR with 50-hour labeled data and Large-Scale Speech Pretraining | | 0
ViewMix: Augmentation for Robust Representation in Self-Supervised Learning | | 0
ViewNet: Unsupervised Viewpoint Estimation from Conditional Generation | | 0
Views Can Be Deceiving: Improved SSL Through Feature Space Augmentation | | 0
VIGraph: Generative Self-supervised Learning for Class-Imbalanced Node Classification | | 0
Vi-MIX FOR SELF-SUPERVISED VIDEO REPRESENTATION | | 0
Virtual Node Generation for Node Classification in Sparsely-Labeled Graphs | | 0
Visible and infrared self-supervised fusion trained on a single example | | 0
Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft | | 0
Vision Learners Meet Web Image-Text Pairs | | 0
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision | | 0
Vision Transformers: State of the Art and Research Challenges | | 0
Visual Lexicon: Rich Image Features in Language Space | | 0
Visually Guided Self Supervised Learning of Speech Representations | | 0
Visual Representation Learning with Stochastic Frame Prediction | | 0
Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations | | 0
ViTAR: Vision Transformer with Any Resolution | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Pretraining: None | Images & Text | 57.5 | | Unverified
2 | Pretraining: ShED | Images & Text | 54.3 | | Unverified
3 | Pretraining: e-Mix | Images & Text | 48.9 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | Accuracy | 91.7 | | Unverified
2 | ResNet18 | Accuracy | 91.02 | | Unverified
3 | MV-MR | Accuracy | 89.67 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | average top-1 classification accuracy | 93.89 | | Unverified
2 | ResNet18 | average top-1 classification accuracy | 92.58 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | average top-1 classification accuracy | 72.51 | | Unverified
2 | ResNet18 | average top-1 classification accuracy | 69.31 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CorInfomax (ResNet50) | Top-1 Accuracy | 82.64 | | Unverified
2 | CorInfomax (ResNet18) | Top-1 Accuracy | 80.48 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | average top-1 classification accuracy | 51.84 | | Unverified
2 | ResNet18 | average top-1 classification accuracy | 51.67 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CorInfomax (ResNet18) | Top-1 Accuracy | 93.18 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CorInfomax (ResNet18) | Top-1 Accuracy | 71.61 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Hybrid BYOL-S/CvT | Accuracy | 67.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CorInfomax (ResNet50) | Top-1 Accuracy | 54.86 | | Unverified