SOTAVerified

Contrastive Learning

Contrastive Learning is a deep learning technique for unsupervised representation learning. The goal is to learn a representation of data such that similar instances are close together in the representation space, while dissimilar instances are far apart.
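One common way to express this goal as a training objective is an InfoNCE-style loss. The sketch below is illustrative only and is not taken from any paper listed on this page. It assumes paired embeddings (anchors[i] and positives[i] are two views of the same instance), treats every other item in the batch as a negative, and uses hypothetical names (`info_nce_loss`, `temperature`) and plain Python lists of non-zero vectors rather than a deep learning framework.

```python
import math


def l2_normalize(v):
    # Scale a non-zero vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch (illustrative sketch).

    positives[i] is the positive for anchors[i]; every other
    positives[j] in the batch is treated as a negative.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair as the correct "class".
        total += log_denom - logits[i]
    return total / len(a)


# Toy batch: "aligned" pairs each anchor with a nearby vector,
# "swapped" pairs each anchor with the other anchor's neighbour.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [[0.1, 0.9], [0.9, 0.1]]
print(info_nce_loss(anchors, aligned), info_nce_loss(anchors, swapped))
```

In this toy batch the loss should be lower when each anchor is paired with its nearby vector than when the pairs are swapped, which is the behaviour the definition above describes: similar instances are pulled together and dissimilar ones are pushed apart.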

It has been shown to be effective in various computer vision and natural language processing tasks, including image retrieval, zero-shot learning, and cross-modal retrieval. In these tasks, the learned representations can be used as features for downstream tasks such as classification and clustering.
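As a rough illustration of how learned representations can serve a retrieval task, the sketch below ranks a small gallery of embeddings by cosine similarity to a query. The embeddings are made-up numbers, and the function names (`cosine_similarity`, `retrieve`) are hypothetical; in practice the vectors would come from a trained encoder.

```python
import math


def cosine_similarity(a, b):
    # Assumes both vectors are non-zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, gallery, k=1):
    """Return the indices of the k gallery embeddings most similar to the query."""
    ranked = sorted(
        range(len(gallery)),
        key=lambda i: cosine_similarity(query, gallery[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up 3-dimensional embeddings standing in for encoder outputs.
gallery = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.2, 0.0]
print(retrieve(query, gallery, k=2))
```

The same similarity scores could feed a nearest-neighbour classifier or a clustering step; retrieval is shown here only because it is the simplest case.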

(Image credit: Schroff et al. 2015)

Papers

Showing 51–100 of 6661 papers

Title | Status | Hype
TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis | Code | 2
Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction | Code | 2
SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning | Code | 2
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Code | 2
Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection | Code | 2
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining | Code | 2
Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis | Code | 2
Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation | Code | 2
Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification | Code | 2
Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models | Code | 2
Without Paired Labeled Data: An End-to-End Self-Supervised Paradigm for UAV-View Geo-Localization | Code | 2
MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining | Code | 2
Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models | Code | 2
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding | Code | 2
Avoiding Shortcuts: Enhancing Channel-Robust Specific Emitter Identification via Single-Source Domain Generalization | Code | 2
Vision Foundation Models for Computed Tomography | Code | 2
Revolutionizing Encrypted Traffic Classification with MH-Net: A Multi-View Heterogeneous Graph Model | Code | 2
Personalized Representation from Personalized Generation | Code | 2
Gramian Multimodal Representation Learning and Alignment | Code | 2
UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities | Code | 2
LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant | Code | 2
SADG: Segment Any Dynamic Gaussian Without Object Trackers | Code | 2
MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers | Code | 2
MCL: Multi-view Enhanced Contrastive Learning for Chest X-ray Report Generation | Code | 2
Learning General-Purpose Biomedical Volume Representations using Randomized Synthesis | Code | 2
DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning | Code | 2
PaPaGei: Open Foundation Models for Optical Physiological Signals | Code | 2
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models | Code | 2
Contrastive learning of cell state dynamics in response to perturbations | Code | 2
BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI | Code | 2
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training | Code | 2
BEVLoc: Cross-View Localization and Matching via Birds-Eye-View Synthesis | Code | 2
RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models | Code | 2
Self-Supervised Any-Point Tracking by Contrastive Random Walks | Code | 2
DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks | Code | 2
EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis | Code | 2
EasyRec: Simple yet Effective Language Models for Recommendation | Code | 2
ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis | Code | 2
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment | Code | 2
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities | Code | 2
Contrastive Learning of Asset Embeddings from Financial Time Series | Code | 2
AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization | Code | 2
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data | Code | 2
4D Contrastive Superflows are Dense 3D Representation Learners | Code | 2
Training-free CryoET Tomogram Segmentation | Code | 2
Language Representations Can be What Recommenders Need: Findings and Potentials | Code | 2
A Unified Framework for 3D Scene Understanding | Code | 2
Robust and Reliable Early-Stage Website Fingerprinting Attacks via Spatial-Temporal Distribution Analysis | Code | 2
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration | Code | 2
DiffMM: Multi-Modal Diffusion Model for Recommendation | Code | 2
Page 2 of 134

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ResNet50 | ImageNet Top-1 Accuracy | 73.6 | | Unverified
2 | ResNet50 | ImageNet Top-1 Accuracy | 73 | | Unverified
3 | ResNet50 | ImageNet Top-1 Accuracy | 71.1 | | Unverified
4 | ResNet50 | ImageNet Top-1 Accuracy | 69.3 | | Unverified
5 | ResNet50 (v2) | ImageNet Top-1 Accuracy | 67.6 | | Unverified
6 | ResNet50 (v2) | ImageNet Top-1 Accuracy | 63.8 | | Unverified
7 | ResNet50 | ImageNet Top-1 Accuracy | 63.6 | | Unverified
8 | ResNet50 | ImageNet Top-1 Accuracy | 61.5 | | Unverified
9 | ResNet50 | ImageNet Top-1 Accuracy | 61.5 | | Unverified
10 | ResNet50 (4×) | ImageNet Top-1 Accuracy | 61.3 | | Unverified
#ModelMetricClaimedVerifiedStatus
110..5sec1Unverified
# | Model | Metric | Claimed | Verified | Status
1 | IPCL (ResNet18) | Accuracy (Top-1) | 84.77 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | IPCL (ResNet18) | Accuracy (Top-1) | 85.55 | | Unverified