zero-shot-classification

Papers

Showing 1–50 of 422 papers

Title	Date	Tasks	Status	Hype	Score
Multimodal Whole Slide Foundation Model for Pathology	Nov 29, 2024	Cross-Modal Retrieval, model	Code Available	4	5
Multi-label Cluster Discrimination for Visual Representation Learning	Jul 24, 2024	Contrastive Learning, Image-text Retrieval	Code Available	4	5
Long-CLIP: Unlocking the Long-Text Capability of CLIP	Mar 22, 2024	Image Generation, Image Retrieval	Code Available	4	5
A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges	Jan 4, 2025	Fairness, Hallucination	Code Available	4	5
FG-CLIP: Fine-Grained Visual and Textual Alignment	May 8, 2025	Image-text Retrieval, object-detection	Code Available	4	5
LLM-Pruner: On the Structural Pruning of Large Language Models	May 19, 2023	Text Generation, zero-shot-classification	Code Available	3	5
CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation	Nov 15, 2024	Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation	Code Available	2	5
TabLLM: Few-shot Classification of Tabular Data with Large Language Models	Oct 19, 2022	Classification, Deep Learning	Code Available	2	5
VeCLIP: Improving CLIP Training via Visual-enriched Captions	Oct 11, 2023	Image-text Retrieval, Retrieval	Code Available	2	5
CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation	Apr 30, 2024	Mamba, State Space Models	Code Available	2	5
Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification	Sep 1, 2024	Scene Classification, Transductive Zero-Shot Classification	Code Available	2	5
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing	Jun 20, 2023	Cross-Modal Retrieval, Image Retrieval	Code Available	2	5
DiffCLIP: Differential Attention Meets CLIP	Mar 9, 2025	Language Modeling, Language Modelling	Code Available	2	5
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models	May 30, 2025	Classification, Disaster Response	Code Available	2	5
RWKV-CLIP: A Robust Vision-Language Representation Learner	Jun 11, 2024	Image-text Retrieval, Representation Learning	Code Available	2	5
Uni3D: Exploring Unified 3D Representation at Scale	Oct 10, 2023	3D Object Classification, Retrieval	Code Available	2	5
Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP	Jun 25, 2024	cross-modal alignment, Image Classification	Code Available	2	5
Boosting Vision-Language Models for Histopathology Classification: Predict all at once	Sep 3, 2024	All, zero-shot-classification	Code Available	2	5
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning	May 31, 2023	Decision Making, General Knowledge	Code Available	2	5
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding	May 14, 2023	3D Classification, 3D Point Cloud Classification	Code Available	2	5
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation	Dec 7, 2022	Semantic Segmentation, zero-shot-classification	Code Available	2	5
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement	Mar 11, 2024	Clinical Knowledge, Descriptive	Code Available	2	5
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing	Jun 19, 2023	Classification, Cross-Modal Retrieval	Code Available	2	5
CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification	Feb 27, 2024	Classification, Diagnostic	Code Available	2	5
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding	Jan 24, 2025	Anatomy, Contrastive Learning	Code Available	2	5
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner	May 16, 2025	Cross-Modal Retrieval, Diagnostic	Code Available	2	5
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature	Jan 13, 2025	Articles, Image-text Retrieval	Code Available	2	5
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models	Feb 19, 2024	Adversarial Defense, Multimodal Deep Learning	Code Available	2	5
Your Diffusion Model is Secretly a Zero-Shot Classifier	Mar 28, 2023	Domain Generalization, Fine-Grained Image Classification	Code Available	2	5
Advancing Medical Representation Learning Through High-Quality Data	Mar 18, 2025	Representation Learning, zero-shot-classification	Code Available	1	5
ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models	Oct 27, 2023	Column Type Annotation, Table annotation	Code Available	1	5
CountCLIP -- [Re] Teaching CLIP to Count to Ten	Jun 5, 2024	zero-shot-classification, Zero-Shot Counting	Code Available	1	5
Controlling Latent Diffusion Using Latent CLIP	Mar 11, 2025	Denoising, Descriptive	Code Available	1	5
Contrastive Language-Image Pre-training for the Italian Language	Aug 19, 2021	Image Retrieval, Multi-label zero-shot learning	Code Available	1	5
Florence: A New Foundation Model for Computer Vision	Nov 22, 2021	Action Classification, Action Recognition	Code Available	1	5
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation	Mar 19, 2024	Decoder, Instance Segmentation	Code Available	1	5
CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification	Feb 25, 2025	Denoising, zero-shot-classification	Code Available	1	5
Exploring Vision-Language Models for Imbalanced Learning	Apr 4, 2023	Decoder, zero-shot-classification	Code Available	1	5
CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections	Nov 28, 2024	image-classification, Image Classification	Code Available	1	5
CLIP-Lite: Information Efficient Visual Representation Learning with Language Supervision	Dec 14, 2021	Contrastive Learning, Representation Learning	Code Available	1	5
CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation	Feb 27, 2025	Image-text matching, Object	Code Available	1	5
CyCLIP: Cyclic Contrastive Language-Image Pretraining	May 28, 2022	Representation Learning, Visual Reasoning	Code Available	1	5
From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection	May 19, 2025	feature selection, Out-of-Distribution Generalization	Code Available	1	5
Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models	Jun 10, 2025	Contrastive Learning, Image-text matching	Code Available	1	5
DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection	Oct 2, 2023	Novel Object Detection, Object	Code Available	1	5
EmoCLIP: A Vision-Language Method for Zero-Shot Video Facial Expression Recognition	Oct 25, 2023	Facial Expression Recognition, Facial Expression Recognition (FER)	Code Available	1	5
Discovering Human Interactions With Novel Objects via Zero-Shot Learning	Jun 1, 2020	Human-Object Interaction Detection, Object	Code Available	1	5
CLIP-Guided Source-Free Object Detection in Aerial Images	Jan 10, 2024	Domain Adaptation, Object	Code Available	1	5
CLIPArTT: Adaptation of CLIP to New Domains at Test Time	May 1, 2024	Pseudo Label, Test-time Adaptation	Code Available	1	5
Discriminative Region-based Multi-Label Zero-Shot Learning	Aug 20, 2021	Image Retrieval, Multi-label zero-shot learning	Code Available	1	5

No leaderboard results yet.