zero-shot-classification

Papers

Showing 1–25 of 422 papers

Title	Date	Tasks	Status	Hype
Multimodal Whole Slide Foundation Model for Pathology	Nov 29, 2024	Cross-Modal Retrieval, model	Code Available	4
Multi-label Cluster Discrimination for Visual Representation Learning	Jul 24, 2024	Contrastive Learning, Image-text Retrieval	Code Available	4
Long-CLIP: Unlocking the Long-Text Capability of CLIP	Mar 22, 2024	Image Generation, Image Retrieval	Code Available	4
FG-CLIP: Fine-Grained Visual and Textual Alignment	May 8, 2025	Image-text Retrieval, object-detection	Code Available	4
A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges	Jan 4, 2025	Fairness, Hallucination	Code Available	4
LLM-Pruner: On the Structural Pruning of Large Language Models	May 19, 2023	Text Generation, zero-shot-classification	Code Available	3
RWKV-CLIP: A Robust Vision-Language Representation Learner	Jun 11, 2024	Image-text Retrieval, Representation Learning	Code Available	2
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models	Feb 19, 2024	Adversarial Defense, Multimodal Deep Learning	Code Available	2
TabLLM: Few-shot Classification of Tabular Data with Large Language Models	Oct 19, 2022	Classification, Deep Learning	Code Available	2
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding	Jan 24, 2025	Anatomy, Contrastive Learning	Code Available	2
Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP	Jun 25, 2024	cross-modal alignment, Image Classification	Code Available	2
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing	Jun 19, 2023	Classification, Cross-Modal Retrieval	Code Available	2
CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation	Apr 30, 2024	Mamba, State Space Models	Code Available	2
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing	Jun 20, 2023	Cross-Modal Retrieval, Image Retrieval	Code Available	2
Boosting Vision-Language Models for Histopathology Classification: Predict all at once	Sep 3, 2024	All, zero-shot-classification	Code Available	2
VeCLIP: Improving CLIP Training via Visual-enriched Captions	Oct 11, 2023	Image-text Retrieval, Retrieval	Code Available	2
Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification	Sep 1, 2024	Scene Classification, Transductive Zero-Shot Classification	Code Available	2
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning	May 31, 2023	Decision Making, General Knowledge	Code Available	2
CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation	Nov 15, 2024	Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation	Code Available	2
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature	Jan 13, 2025	Articles, Image-text Retrieval	Code Available	2
CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification	Feb 27, 2024	Classification, Diagnostic	Code Available	2
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models	May 30, 2025	Classification, Disaster Response	Code Available	2
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner	May 16, 2025	Cross-Modal Retrieval, Diagnostic	Code Available	2
DiffCLIP: Differential Attention Meets CLIP	Mar 9, 2025	Language Modeling, Language Modelling	Code Available	2
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding	May 14, 2023	3D Classification, 3D Point Cloud Classification	Code Available	2

No leaderboard results yet.