| CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination | Aug 18, 2024 | Knowledge Distillation, Transfer Learning | —Unverified | 0 |
| DC3DO: Diffusion Classifier for 3D Objects | Aug 13, 2024 | 3D Object Classification, Classification | Code Available | 1 |
| Efficient Test-Time Prompt Tuning for Vision-Language Models | Aug 11, 2024 | Contrastive Learning, Prompt Learning | —Unverified | 0 |
| Efficient and Versatile Robust Fine-Tuning of Zero-shot Models | Aug 11, 2024 | Cross-Modal Retrieval, zero-shot-classification | —Unverified | 0 |
| Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts | Aug 5, 2024 | zero-shot-classification, Zero-Shot Learning | —Unverified | 0 |
| AdaCBM: An Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis | Aug 4, 2024 | Classification, Diagnostic | Code Available | 0 |
| Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian | Jul 30, 2024 | Document Classification, Entity Typing | —Unverified | 0 |
| Generative Diffusion Model Bootstraps Zero-shot Classification of Fetal Ultrasound Images In Underrepresented African Populations | Jul 29, 2024 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| Adversarial Robustification via Text-to-Image Diffusion Models | Jul 26, 2024 | Adversarial Robustness, zero-shot-classification | Code Available | 1 |
| I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition | Jul 25, 2024 | Instrument Recognition, Retrieval | Code Available | 0 |
| Multi-label Cluster Discrimination for Visual Representation Learning | Jul 24, 2024 | Contrastive Learning, Image-text Retrieval | Code Available | 4 |
| Multi-modal Relation Distillation for Unified 3D Representation Learning | Jul 19, 2024 | Relation, Representation Learning | —Unverified | 0 |
| ModalChorus: Visual Probing and Alignment of Multi-modal Embeddings via Modal Fusion Map | Jul 17, 2024 | Cross-Modal Retrieval, Dimensionality Reduction | Code Available | 0 |
| CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging | Jul 10, 2024 | Contrastive Learning, Image-text Retrieval | —Unverified | 0 |
| Semantic Compositions Enhance Vision-Language Contrastive Learning | Jul 1, 2024 | Classification, Contrastive Learning | —Unverified | 0 |
| Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP | Jun 25, 2024 | cross-modal alignment, Image Classification | Code Available | 2 |
| At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models | Jun 24, 2024 | Astronomy, Classification | —Unverified | 0 |
| A Simple Framework for Open-Vocabulary Zero-Shot Segmentation | Jun 23, 2024 | Representation Learning, zero-shot-classification | —Unverified | 0 |
| BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM | Jun 17, 2024 | Continual Pretraining, zero-shot-classification | —Unverified | 0 |
| Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition | Jun 13, 2024 | Retrieval, zero-shot-classification | Code Available | 1 |
| RWKV-CLIP: A Robust Vision-Language Representation Learner | Jun 11, 2024 | Image-text Retrieval, Representation Learning | Code Available | 2 |
| Understanding Visual Concepts Across Models | Jun 11, 2024 | Image Generation, object-detection | Code Available | 0 |
| CountCLIP -- [Re] Teaching CLIP to Count to Ten | Jun 5, 2024 | zero-shot-classification, Zero-Shot Counting | Code Available | 1 |
| SLANT: Spurious Logo ANalysis Toolkit | Jun 3, 2024 | zero-shot-classification, Zero-Shot Learning | —Unverified | 0 |
| Multi-Modal Generative Embedding Model | May 29, 2024 | Caption Generation, Cross-Modal Retrieval | —Unverified | 0 |