SOTAVerified

Contrastive Learning

Contrastive learning is a deep learning technique for unsupervised representation learning. The goal is to learn a representation in which similar instances lie close together in the representation space and dissimilar instances lie far apart.
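As a rough illustration of the "similar close, dissimilar far" objective, the sketch below computes an InfoNCE-style loss for one anchor. InfoNCE is only one common contrastive objective and is an assumption here, not something this page specifies. The function names, the temperature of 0.1, and the toy 2-D embeddings are all hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for a single anchor (illustrative sketch only).

    The loss is the negative log-probability of picking the positive
    among {positive} + negatives, using temperature-scaled cosine similarity.
    """
    pos_logit = cosine_similarity(anchor, positive) / temperature
    logits = [pos_logit] + [
        cosine_similarity(anchor, n) / temperature for n in negatives
    ]
    # Numerically stable log-sum-exp over all logits.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(l - max_logit) for l in logits)
    )
    return log_sum_exp - pos_logit

# Hypothetical toy embeddings (not taken from any real model).
anchor = [1.0, 0.0]
near_positive = [0.9, 0.1]   # positive that already lies close to the anchor
far_positive = [0.0, 1.0]    # positive that lies far from the anchor
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss_near = info_nce_loss(anchor, near_positive, negatives)
loss_far = info_nce_loss(anchor, far_positive, negatives)
print(loss_near < loss_far)
```

The loss is small when the positive already sits near the anchor and larger when it does not, so minimizing it pulls positives in and pushes negatives away.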

It has been shown to be effective in various computer vision and natural language processing tasks, including image retrieval, zero-shot learning, and cross-modal retrieval. The learned representations can then serve as features for downstream tasks such as classification and clustering.
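To make the retrieval use case concrete, here is a minimal sketch that ranks gallery items by cosine similarity to a query embedding. It assumes the embeddings were already produced by some trained encoder. The item names, vectors, and the `retrieve` helper are hypothetical and exist only for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def retrieve(query, gallery, top_k=2):
    """Return the names of the top_k gallery items most similar to the query."""
    q = l2_normalize(query)
    scored = []
    for name, embedding in gallery.items():
        g = l2_normalize(embedding)
        similarity = sum(a * b for a, b in zip(q, g))
        scored.append((similarity, name))
    # Highest similarity first.
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Hypothetical embeddings standing in for the output of a trained encoder.
gallery = {
    "cat_1": [0.9, 0.1, 0.0],
    "cat_2": [0.8, 0.3, 0.1],
    "car_1": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

ranked = retrieve(query, gallery, top_k=2)
print(ranked)
```

If the representation space is well structured, the two cat-like embeddings rank above the car-like one for this query.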

(Image credit: Schroff et al. 2015)

Papers

Showing 251-300 of 6661 papers

| Title | Status | Hype |
| --- | --- | --- |
| Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning | Code | 1 |
| Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition | Code | 1 |
| EEG-CLIP : Learning EEG representations from natural language descriptions | Code | 1 |
| FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification | Code | 1 |
| Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization | Code | 1 |
| Variational Bayesian Personalized Ranking | Code | 1 |
| LuSeg: Efficient Negative and Positive Obstacles Segmentation via Contrast-Driven Multi-Modal Feature Fusion on the Lunar | Code | 1 |
| Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology | Code | 1 |
| Aligning Text to Image in Diffusion Models is Easier Than You Think | Code | 1 |
| TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models | Code | 1 |
| Frequency-Based Alignment of EEG and Audio Signals Using Contrastive Learning and SincNet for Auditory Attention Detection | Code | 1 |
| Cross-modal Causal Relation Alignment for Video Question Grounding | Code | 1 |
| Bridging Spectral-wise and Multi-spectral Depth Estimation via Geometry-guided Contrastive Learning | Code | 1 |
| Your contrastive learning problem is secretly a distribution alignment problem | Code | 1 |
| Snoopy: Effective and Efficient Semantic Join Discovery via Proxy Columns | Code | 1 |
| SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training | Code | 1 |
| Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection | Code | 1 |
| Myna: Masking-Based Contrastive Learning of Musical Representations | Code | 1 |
| MVCNet: Multi-View Contrastive Network for Motor Imagery Classification | Code | 1 |
| Toward Foundational Model for Sleep Analysis Using a Multimodal Hybrid Self-Supervised Learning Framework | Code | 1 |
| Following the Autoregressive Nature of LLM Embeddings via Compression and Alignment | Code | 1 |
| MC2SleepNet: Multi-modal Cross-masking with Contrastive Learning for Sleep Stage Classification | Code | 1 |
| Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions | Code | 1 |
| MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification | Code | 1 |
| Learning Clustering-based Prototypes for Compositional Zero-shot Learning | Code | 1 |
| Hierarchical Consensus Network for Multiview Feature Learning | Code | 1 |
| T-SCEND: Test-time Scalable MCTS-enhanced Diffusion Model | Code | 1 |
| CycleGuardian: A Framework for Automatic RespiratorySound classification Based on Improved Deep clustering and Contrastive Learning | Code | 1 |
| Prostate-Specific Foundation Models for Enhanced Detection of Clinically Significant Cancer | Code | 1 |
| Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation | Code | 1 |
| Low-rank Prompt Interaction for Continual Vision-Language Retrieval | Code | 1 |
| Leveraging Textual Anatomical Knowledge for Class-Imbalanced Semi-Supervised Multi-Organ Segmentation | Code | 1 |
| MixRec: Individual and Collective Mixing Empowers Data Augmentation for Recommender Systems | Code | 1 |
| Assisting Mathematical Formalization with A Learning-based Premise Retriever | Code | 1 |
| MedFILIP: Medical Fine-grained Language-Image Pre-training | Code | 1 |
| LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection | Code | 1 |
| AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations | Code | 1 |
| A Simple Graph Contrastive Learning Framework for Short Text Classification | Code | 1 |
| Towards Robust and Realistic Human Pose Estimation via WiFi Signals | Code | 1 |
| Uncertainty-aware Knowledge Tracing | Code | 1 |
| AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data | Code | 1 |
| Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs | Code | 1 |
| Dual-level Adaptive Incongruity-enhanced Model for Multimodal Sarcasm Detection | Code | 1 |
| Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection | Code | 1 |
| Multimodal Contrastive Representation Learning in Augmented Biomedical Knowledge Graphs | Code | 1 |
| MADGEN: Mass-Spec attends to De Novo Molecular generation | Code | 1 |
| Relation3D : Enhancing Relation Modeling for Point Cloud Instance Segmentation | Code | 1 |
| SmartCLIP: Modular Vision-language Alignment with Identification Guarantees | Code | 1 |
| Frequency-Masked Embedding Inference: A Non-Contrastive Approach for Time Series Representation Learning | Code | 1 |
| EraseAnything: Enabling Concept Erasure in Rectified Flow Transformers | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ResNet50 | ImageNet Top-1 Accuracy | 73.6 | | Unverified |
| 2 | ResNet50 | ImageNet Top-1 Accuracy | 73 | | Unverified |
| 3 | ResNet50 | ImageNet Top-1 Accuracy | 71.1 | | Unverified |
| 4 | ResNet50 | ImageNet Top-1 Accuracy | 69.3 | | Unverified |
| 5 | ResNet50 (v2) | ImageNet Top-1 Accuracy | 67.6 | | Unverified |
| 6 | ResNet50 (v2) | ImageNet Top-1 Accuracy | 63.8 | | Unverified |
| 7 | ResNet50 | ImageNet Top-1 Accuracy | 63.6 | | Unverified |
| 8 | ResNet50 | ImageNet Top-1 Accuracy | 61.5 | | Unverified |
| 9 | ResNet50 | ImageNet Top-1 Accuracy | 61.5 | | Unverified |
| 10 | ResNet50 (4×) | ImageNet Top-1 Accuracy | 61.3 | | Unverified |

#ModelMetricClaimedVerifiedStatus
110..5sec1Unverified

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IPCL (ResNet18) | Accuracy (Top-1) | 84.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IPCL (ResNet18) | Accuracy (Top-1) | 85.55 | | Unverified |