| FLAVARS: A Multimodal Foundational Language and Vision Alignment Model for Remote Sensing | Jan 14, 2025 | Classification, Contrastive Learning | —Unverified | 0 |
| Generating Visual Representations for Zero-Shot Classification | Aug 23, 2017 | Classification, General Classification | —Unverified | 0 |
| Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval | May 6, 2024 | Image Retrieval, Language Modeling | —Unverified | 0 |
| Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection | Oct 28, 2024 | zero-shot-classification, Zero-Shot Learning | —Unverified | 0 |
| CLAREL: Classification via retrieval loss for zero-shot learning | May 31, 2019 | Classification, General Classification | —Unverified | 0 |
| CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs | Aug 19, 2024 | Hallucination, zero-shot-classification | —Unverified | 0 |
| CLIP with Quality Captions: A Strong Pretraining for Vision Tasks | May 14, 2024 | Depth Estimation, object-detection | —Unverified | 0 |
| AVGZSLNet: Audio-Visual Generalized Zero-Shot Learning by Reconstructing Label Features from Multi-Modal Embeddings | May 27, 2020 | Decoder, Generalized Zero-Shot Learning | —Unverified | 0 |
| Hard Negative Mining for Metric Learning Based Zero-Shot Classification | Aug 26, 2016 | Classification, Domain Adaptation | —Unverified | 0 |
| Fill the Gap: Quantifying and Reducing the Modality Gap in Image-Text Representation Learning | May 6, 2025 | Representation Learning, zero-shot-classification | —Unverified | 0 |