Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | Nov 2, 2022 | Contrastive Learning, Image Classification
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models | Apr 19, 2022 | Fairness, Few-Shot Image Classification
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | Nov 12, 2022 | Contrastive Learning, Cross-Modal Retrieval
PromptKD: Unsupervised Prompt Distillation for Vision-Language Models | Mar 5, 2024 | Knowledge Distillation, Prompt Engineering
WATT: Weight Average Test-Time Adaptation of CLIP | Jun 19, 2024 | Image Classification
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | Oct 2, 2023 | Image Classification
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | Feb 11, 2021 | Cross-Modal Retrieval, Fine-Grained Image Classification
What does a platypus look like? Generating customized prompts for zero-shot image classification | Sep 7, 2022 | Descriptive, Image Classification
PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration | Jun 28, 2024 | Image Classification
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion | Feb 6, 2025 | Image Classification
Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP | Jun 25, 2024 | Cross-Modal Alignment, Image Classification
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Sep 28, 2024 | Image Classification
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Jun 19, 2023 | Classification, Cross-Modal Retrieval
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets | Feb 6, 2023 | Classification, Image Classification
Disentangled Ontology Embedding for Zero-shot Learning | Jun 8, 2022 | Image Classification
CamDiff: Camouflage Image Augmentation via Diffusion Model | Apr 11, 2023 | Dataset Generation, Image Augmentation
PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization | Jul 27, 2023 | Domain Generalization, Image Classification
Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding | Jun 15, 2023 | Contrastive Learning, Image Classification
Zero-Shot Temporal Action Detection via Vision-Language Prompting | Jul 17, 2022 | Action Detection, Classification
TaxaBind: A Unified Embedding Space for Ecological Applications | Nov 1, 2024 | Audio Classification, Cross-Modal Retrieval
PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts | Aug 2, 2023 | Classification, Image Classification
CCMB: A Large-scale Chinese Cross-modal Benchmark | May 8, 2022 | Image Classification
Masked Unsupervised Self-training for Label-free Image Classification | Jun 7, 2022 | Image Classification
General Image Descriptors for Open World Image Retrieval using ViT CLIP | Oct 20, 2022 | Image Retrieval, Retrieval
Post-hoc Probabilistic Vision-Language Models | Dec 8, 2024 | Active Learning, Uncertainty Quantification
Mind's Eye: Image Recognition by EEG via Multimodal Similarity-Keeping Contrastive Learning | Jun 5, 2024 | Contrastive Learning, EEG
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | Apr 28, 2021 | Image Classification
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability | Jul 6, 2023 | Few-Shot Image Classification, Image Classification
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval | Jan 1, 2023 | Image Classification
Reproducible scaling laws for contrastive language-image learning | Dec 14, 2022 | Image Classification, Open Vocabulary Attribute Detection
Benchmarking Knowledge-driven Zero-shot Learning | Jun 29, 2021 | Attribute, Benchmarking
Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning | Apr 4, 2024 | Contrastive Learning, Image Classification
Interpreting and Analysing CLIP's Zero-Shot Image Classification via Mutual Knowledge | Oct 16, 2024 | Classification, Image Classification
DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning | Jul 4, 2022 | Attribute, Contrastive Learning
Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification | Mar 2, 2022 | Image Classification
Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations | Jun 14, 2023 | Image Classification
FILIP: Fine-grained Interactive Language-Image Pre-Training | Nov 9, 2021 | Image Classification
Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations | Mar 29, 2024 | Image Classification
Can We Talk Models Into Seeing the World Differently? | Mar 14, 2024 | Image Captioning, Image Classification
Generative Multi-Label Zero-Shot Learning | Jan 27, 2021 | Attribute, Generative Adversarial Network
A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | Dec 29, 2021 | Image Classification
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text | Mar 25, 2025 | Cross-Modal Retrieval, Hallucination
LiT: Zero-Shot Transfer with Locked-image text Tuning | Nov 15, 2021 | Image Classification
Structure Pretraining and Prompt Tuning for Knowledge Graph Transfer | Mar 3, 2023 | Image Classification
Zero-Shot Logit Adjustment | Apr 25, 2022 | Bayesian Inference, Generalized Zero-Shot Learning
DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models | Aug 16, 2024 | Domain Adaptation, Image Classification
Do Vision-Language Foundational models show Robust Visual Perception? | Aug 13, 2024 | Image Classification
PaLI: A Jointly-Scaled Multilingual Language-Image Model | Sep 14, 2022 | Decoder, Few-Shot Image Classification
Towards Difficulty-Agnostic Efficient Transfer Learning for Vision-Language Models | Nov 27, 2023 | General Knowledge, Image Classification
Language-Driven Anchors for Zero-Shot Adversarial Robustness | Jan 30, 2023 | Adversarial Defense, Adversarial Robustness