SOTAVerified

Contrastive Learning

Contrastive Learning is a deep learning technique for unsupervised representation learning. The goal is to learn a representation of data such that similar instances are close together in the representation space, while dissimilar instances are far apart.
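As a rough illustration of "similar close, dissimilar far", the sketch below computes an InfoNCE-style loss for a single anchor in plain Python. This is one common contrastive objective, not necessarily the one used by any paper listed here. The function names, the toy 2-D vectors, and the `temperature` value of 0.1 are all assumptions made for the example.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor: small when the positive is the closest candidate."""
    a = l2_normalize(anchor)
    candidates = [l2_normalize(positive)] + [l2_normalize(n) for n in negatives]
    # Temperature-scaled cosine similarity between the anchor and every candidate.
    logits = [sum(x * y for x, y in zip(a, c)) / temperature for c in candidates]
    # Cross-entropy with the positive (index 0) as the target, using a stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]

# Toy embeddings (hypothetical values chosen only for illustration).
anchor = [1.0, 0.0]
negatives = [[-1.0, 0.2], [0.0, -1.0]]
loss_good = info_nce(anchor, [0.9, 0.1], negatives)   # positive lies near the anchor
loss_bad = info_nce(anchor, [-0.9, 0.1], negatives)   # positive lies far from the anchor
print(loss_good < loss_bad)
```

Minimizing this quantity over many anchors pulls each positive pair together and pushes the negatives away, which is the behavior described above. In practice the loss is computed over batches with an automatic-differentiation framework rather than per example.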

It has been shown to be effective in various computer vision and natural language processing tasks, including image retrieval, zero-shot learning, and cross-modal retrieval. The learned representations can then be used as features for downstream tasks such as classification and clustering.
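To make the retrieval use case concrete, here is a minimal sketch that ranks a small gallery of embeddings by cosine similarity to a query embedding. The gallery keys, the vector values, and the `retrieve` helper are hypothetical. A real system would obtain the embeddings from a trained encoder and would typically use an approximate nearest-neighbor index.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def retrieve(query, gallery, k=2):
    """Return the k gallery keys whose embeddings are most similar to the query."""
    ranked = sorted(gallery, key=lambda name: cosine(query, gallery[name]), reverse=True)
    return ranked[:k]

# Hypothetical precomputed embeddings; in practice these come from the learned encoder.
gallery = {
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_photo": [0.7, 0.6, 0.1],
    "car_photo": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
top = retrieve(query, gallery, k=2)
print(top)
```

The same similarity scores can feed a k-nearest-neighbor classifier or a clustering step, since both rely only on distances in the representation space.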

(Image credit: Schroff et al. 2015)

Papers

Showing 101–150 of 6661 papers

| Title | Status | Hype |
| --- | --- | --- |
| An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | Code | 2 |
| Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning | Code | 2 |
| A DeNoising FPN With Transformer R-CNN for Tiny Object Detection | Code | 2 |
| Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language | Code | 2 |
| SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals | Code | 2 |
| Improved Canonicalization for Model Agnostic Equivariance | Code | 2 |
| DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment | Code | 2 |
| Transcriptomics-guided Slide Representation Learning in Computational Pathology | Code | 2 |
| HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition | Code | 2 |
| An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Code | 2 |
| Vision-and-Language Navigation via Causal Learning | Code | 2 |
| Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking | Code | 2 |
| Latent Guard: a Safety Framework for Text-to-image Generation | Code | 2 |
| NeuroNet: A Novel Hybrid Self-Supervised Learning Framework for Sleep Stage Classification Using Single-Channel EEG | Code | 2 |
| Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | Code | 2 |
| A Comprehensive Survey on Self-Supervised Learning for Recommendation | Code | 2 |
| GenN2N: Generative NeRF2NeRF Translation | Code | 2 |
| DreamLIP: Language-Image Pre-training with Long Captions | Code | 2 |
| RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Code | 2 |
| GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping | Code | 2 |
| Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification | Code | 2 |
| DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning | Code | 2 |
| TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space | Code | 2 |
| Neighborhood-Enhanced Supervised Contrastive Learning for Collaborative Filtering | Code | 2 |
| DNABERT-S: Pioneering Species Differentiation with Species-Aware DNA Embeddings | Code | 2 |
| One Train for Two Tasks: An Encrypted Traffic Classification Framework Using Supervised Contrastive Learning | Code | 2 |
| Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning | Code | 2 |
| Self-Supervised Contrastive Learning for Long-term Forecasting | Code | 2 |
| HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition | Code | 2 |
| Learn From Zoom: Decoupled Supervised Contrastive Learning For WCE Image Classification | Code | 2 |
| End-to-end Learnable Clustering for Intent Learning in Recommendation | Code | 2 |
| FedTGP: Trainable Global Prototypes with Adaptive-Margin-Enhanced Contrastive Learning for Data and Model Heterogeneity in Federated Learning | Code | 2 |
| Unsupervised Continual Anomaly Detection with Contrastively-learned Prompt | Code | 2 |
| Learning Vision from Models Rivals Learning Vision from Data | Code | 2 |
| One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts | Code | 2 |
| FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning | Code | 2 |
| Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response | Code | 2 |
| SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery | Code | 2 |
| X-Pose: Detecting Any Keypoints | Code | 2 |
| GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization | Code | 2 |
| Detecting and Grounding Multi-Modal Media Manipulation and Beyond | Code | 2 |
| MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval | Code | 2 |
| LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching | Code | 2 |
| DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection | Code | 2 |
| LibAUC: A Deep Learning Library for X-Risk Optimization | Code | 2 |
| Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors | Code | 2 |
| OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding | Code | 2 |
| Detecting and Grounding Multi-Modal Media Manipulation | Code | 2 |
| RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding | Code | 2 |
| Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert | Code | 2 |
Page 3 of 134

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ResNet50 | ImageNet Top-1 Accuracy | 73.6 | | Unverified |
| 2 | ResNet50 | ImageNet Top-1 Accuracy | 73 | | Unverified |
| 3 | ResNet50 | ImageNet Top-1 Accuracy | 71.1 | | Unverified |
| 4 | ResNet50 | ImageNet Top-1 Accuracy | 69.3 | | Unverified |
| 5 | ResNet50 (v2) | ImageNet Top-1 Accuracy | 67.6 | | Unverified |
| 6 | ResNet50 (v2) | ImageNet Top-1 Accuracy | 63.8 | | Unverified |
| 7 | ResNet50 | ImageNet Top-1 Accuracy | 63.6 | | Unverified |
| 8 | ResNet50 | ImageNet Top-1 Accuracy | 61.5 | | Unverified |
| 9 | ResNet50 | ImageNet Top-1 Accuracy | 61.5 | | Unverified |
| 10 | ResNet50 (4×) | ImageNet Top-1 Accuracy | 61.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 10..5sec | | 1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IPCL (ResNet18) | Accuracy (Top-1) | 84.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IPCL (ResNet18) | Accuracy (Top-1) | 85.55 | | Unverified |