SOTAVerified

zero-shot-classification

Papers

Showing 76-100 of 422 papers

| Title | Status | Hype |
|---|---|---|
| Multimodal Remote Sensing Scene Classification Using VLMs and Dual-Cross Attention Networks | Code | 0 |
| Perturb and Recover: Fine-tuning for Effective Backdoor Removal from CLIP | Code | 0 |
| Multimodal Whole Slide Foundation Model for Pathology | Code | 4 |
| CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections | Code | 1 |
| Active Data Curation Effectively Distills Large-Scale Multimodal Models | | 0 |
| TableTime: Reformulating Time Series Classification as Zero-Shot Table Understanding via Large Language Models | Code | 1 |
| CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation | Code | 1 |
| CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | Code | 2 |
| Measuring similarity between embedding spaces using induced neighborhood graphs | | 0 |
| NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics | | 0 |
| Enhancing Visual Classification using Comparative Descriptors | Code | 0 |
| Asterisk*: Keep it Simple | | 0 |
| RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models | Code | 1 |
| ResiDual Transformer Alignment with Spectral Decomposition | | 0 |
| Active Learning for Vision-Language Models | | 0 |
| Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection | | 0 |
| Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models | | 0 |
| MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report | Code | 0 |
| Assessing Open-world Forgetting in Generative Image Model Customization | | 0 |
| Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data? | | 0 |
| LLM Chain Ensembles for Scalable and Accurate Data Annotation | Code | 0 |
| Interpreting and Analysing CLIP's Zero-Shot Image Classification via Mutual Knowledge | Code | 1 |
| CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning | | 0 |
| A Unified Debiasing Approach for Vision-Language Models across Modalities and Tasks | Code | 0 |
| GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models | Code | 0 |

No leaderboard results yet.