| From Unimodal to Multimodal: Scaling up Projectors to Align Modalities | Sep 28, 2024 | Image-text Retrieval, Semantic Similarity | Code Available | 0 |
| CleanerCLIP: Fine-grained Counterfactual Semantic Augmentation for Backdoor Defense in Contrastive Learning | Sep 26, 2024 | backdoor defense, Contrastive Learning | Unverified | 0 |
| Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data | Sep 15, 2024 | Benchmarking, text annotation | Unverified | 0 |
| Pushing the Limits of Vision-Language Models in Remote Sensing without Human Annotations | Sep 11, 2024 | Image-text Retrieval, Text Retrieval | Unverified | 0 |
| An Art-centric perspective on AI-based content moderation of nudity | Sep 10, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| Deep Clustering of Remote Sensing Scenes through Heterogeneous Transfer Learning | Sep 5, 2024 | Clustering, Deep Clustering | Unverified | 0 |
| Have Large Vision-Language Models Mastered Art History? | Sep 5, 2024 | Classification, image-classification | Unverified | 0 |
| Optimizing CLIP Models for Image Retrieval with Maintained Joint-Embedding Alignment | Sep 3, 2024 | Image Retrieval, Retrieval | Code Available | 0 |
| Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data | Sep 2, 2024 | Mortality Prediction, zero-shot-classification | Code Available | 0 |
| EEG-Language Modeling for Pathology Detection | Sep 2, 2024 | Contrastive Learning, EEG | Unverified | 0 |
| PoliPrompt: A High-Performance Cost-Effective LLM-Based Text Classification Framework for Political Science | Sep 2, 2024 | Classification, Feature Engineering | Unverified | 0 |
| Multi-label Zero-Shot Audio Classification with Temporal Attention | Aug 31, 2024 | Audio Classification, Classification | Unverified | 0 |
| Visual Prompt Engineering for Medical Vision Language Models in Radiology | Aug 28, 2024 | Classification, image-classification | Unverified | 0 |
| Online Zero-Shot Classification with CLIP | Aug 23, 2024 | Classification, zero-shot-classification | Code Available | 0 |
| PRG: Prompt-Based Distillation Without Annotation via Proxy Relational Graph | Aug 22, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| XDT-CXR: Investigating Cross-Disease Transferability in Zero-Shot Binary Classification of Chest X-Rays | Aug 21, 2024 | Binary Classification, Diagnostic | Code Available | 0 |
| CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs | Aug 19, 2024 | Hallucination, zero-shot-classification | Unverified | 0 |
| CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination | Aug 18, 2024 | Knowledge Distillation, Transfer Learning | Unverified | 0 |
| Efficient and Versatile Robust Fine-Tuning of Zero-shot Models | Aug 11, 2024 | Cross-Modal Retrieval, zero-shot-classification | Unverified | 0 |
| Efficient Test-Time Prompt Tuning for Vision-Language Models | Aug 11, 2024 | Contrastive Learning, Prompt Learning | Unverified | 0 |
| Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts | Aug 5, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| AdaCBM: An Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis | Aug 4, 2024 | Classification, Diagnostic | Code Available | 0 |
| Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian | Jul 30, 2024 | Document Classification, Entity Typing | Unverified | 0 |
| Generative Diffusion Model Bootstraps Zero-shot Classification of Fetal Ultrasound Images In Underrepresented African Populations | Jul 29, 2024 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition | Jul 25, 2024 | Instrument Recognition, Retrieval | Code Available | 0 |
| Multi-modal Relation Distillation for Unified 3D Representation Learning | Jul 19, 2024 | Relation, Representation Learning | Unverified | 0 |
| ModalChorus: Visual Probing and Alignment of Multi-modal Embeddings via Modal Fusion Map | Jul 17, 2024 | Cross-Modal Retrieval, Dimensionality Reduction | Code Available | 0 |
| CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging | Jul 10, 2024 | Contrastive Learning, Image-text Retrieval | Unverified | 0 |
| Semantic Compositions Enhance Vision-Language Contrastive Learning | Jul 1, 2024 | Classification, Contrastive Learning | Unverified | 0 |
| At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models | Jun 24, 2024 | Astronomy, Classification | Unverified | 0 |
| A Simple Framework for Open-Vocabulary Zero-Shot Segmentation | Jun 23, 2024 | Representation Learning, zero-shot-classification | Unverified | 0 |
| BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM | Jun 17, 2024 | Continual Pretraining, zero-shot-classification | Unverified | 0 |
| Understanding Visual Concepts Across Models | Jun 11, 2024 | Image Generation, object-detection | Code Available | 0 |
| SLANT: Spurious Logo ANalysis Toolkit | Jun 3, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| Multi-Modal Generative Embedding Model | May 29, 2024 | Caption Generation, Cross-Modal Retrieval | Unverified | 0 |
| It's Not a Modality Gap: Characterizing and Addressing the Contrastive Gap | May 28, 2024 | image-classification, Image Classification | Unverified | 0 |
| MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding | May 28, 2024 | 3D Classification, 3D Object Recognition | Unverified | 0 |
| Listenable Maps for Zero-Shot Audio Classifiers | May 27, 2024 | Decoder, zero-shot-classification | Unverified | 0 |
| CapS-Adapter: Caption-based MultiModal Adapter in Zero-Shot Classification | May 26, 2024 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models | May 24, 2024 | Classification, image-classification | Code Available | 0 |
| BDetCLIP: Multimodal Prompting Contrastive Test-Time Backdoor Detection | May 24, 2024 | Contrastive Learning, Language Modelling | Unverified | 0 |
| Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations | May 23, 2024 | Contrastive Learning, Instance Segmentation | Code Available | 0 |
| Tuning-free Universally-Supervised Semantic Segmentation | May 23, 2024 | Segmentation, Semantic Segmentation | Unverified | 0 |
| Stylometric Watermarks for Large Language Models | May 14, 2024 | Sentence, zero-shot-classification | Unverified | 0 |
| CLIP with Quality Captions: A Strong Pretraining for Vision Tasks | May 14, 2024 | Depth Estimation, object-detection | Unverified | 0 |
| Advanced Natural-based interaction for the ITAlian language: LLaMAntino-3-ANITA | May 11, 2024 | Computational Efficiency, Language Modelling | Unverified | 0 |
| The Effect of Model Size on LLM Post-hoc Explainability via LIME | May 8, 2024 | Natural Language Inference, zero-shot-classification | Code Available | 0 |
| Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval | May 6, 2024 | Image Retrieval, Language Modeling | Unverified | 0 |
| Source-Free Domain Adaptation Guided by Vision and Vision-Language Pre-Training | May 5, 2024 | Domain Adaptation, Language Modelling | Code Available | 0 |
| ESP-Zero: Unsupervised enhancement of zero-shot classification for Extremely Sparse Point cloud | Apr 30, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |