| ProtoCLIP: Prototypical Contrastive Language Image Pretraining | Jun 22, 2022 | zero-shot-classification, Zero-Shot Learning | Code Available | 1 |
| CyCLIP: Cyclic Contrastive Language-Image Pretraining | May 28, 2022 | Representation Learning, Visual Reasoning | Code Available | 1 |
| Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning | May 25, 2022 | text-classification, Text Classification | Code Available | 1 |
| No Token Left Behind: Explainability-Aided Image Classification and Generation | Apr 11, 2022 | image-classification, Image Classification | Code Available | 1 |
| Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning | Mar 3, 2022 | Contrastive Learning, Fairness | Code Available | 1 |
| Decoupling Zero-Shot Semantic Segmentation | Dec 15, 2021 | Open Vocabulary Semantic Segmentation, Segmentation | Code Available | 1 |
| CLIP-Lite: Information Efficient Visual Representation Learning with Language Supervision | Dec 14, 2021 | Contrastive Learning, Representation Learning | Code Available | 1 |
| Florence: A New Foundation Model for Computer Vision | Nov 22, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Wav2CLIP: Learning Robust Audio Representations From CLIP | Oct 21, 2021 | Cross-Modal Retrieval, Image Generation | Code Available | 1 |
| Zero-Shot Out-of-Distribution Detection Based on the Pre-trained Model CLIP | Sep 6, 2021 | Image Description, Out-of-Distribution Detection | Code Available | 1 |
| Discriminative Region-based Multi-Label Zero-Shot Learning | Aug 20, 2021 | Image Retrieval, Multi-label zero-shot learning | Code Available | 1 |
| Contrastive Language-Image Pre-training for the Italian Language | Aug 19, 2021 | Image Retrieval, Multi-label zero-shot learning | Code Available | 1 |
| Toward Zero-Shot Unsupervised Image-to-Image Translation | Jul 28, 2020 | Attribute, Image-to-Image Translation | Code Available | 1 |
| Discovering Human Interactions With Novel Objects via Zero-Shot Learning | Jun 1, 2020 | Human-Object Interaction Detection, Object | Code Available | 1 |
| Deep Learning Models for Multilingual Hate Speech Detection | Apr 14, 2020 | Deep Learning, Hate Speech Detection | Code Available | 1 |
| Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification | Mar 17, 2020 | Action Classification, Classification | Code Available | 1 |
| Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages | Jan 30, 2020 | Cross-Lingual Transfer, named-entity-recognition | Code Available | 1 |
| Episode-based Prototype Generating Network for Zero-Shot Learning | Sep 8, 2019 | zero-shot-classification, Zero-Shot Learning | Code Available | 1 |
| Zero-Shot Semantic Segmentation | Jun 3, 2019 | General Classification, Segmentation | Code Available | 1 |
| DEARLi: Decoupled Enhancement of Recognition and Localization for Semi-supervised Panoptic Segmentation | Jul 14, 2025 | Decoder, GPU | Code Available | 0 |
| Comparison of ConvNeXt and Vision-Language Models for Breast Density Assessment in Screening Mammography | Jun 16, 2025 | breast density classification, Classification | Unverified | 0 |
| Harmonizing and Merging Source Models for CLIP-based Domain Generalization | Jun 11, 2025 | Domain Generalization, zero-shot-classification | Unverified | 0 |
| AmorLIP: Efficient Language-Image Pretraining via Amortization | May 25, 2025 | Contrastive Learning, Representation Learning | Code Available | 0 |
| Distill CLIP (DCLIP): Enhancing Image-Text Retrieval via Cross-Modal Transformer Distillation | May 25, 2025 | Contrastive Learning, Image-text Retrieval | Unverified | 0 |
| Beginning with You: Perceptual-Initialization Improves Vision-Language Representation and Alignment | May 20, 2025 | Representation Learning, Retrieval | Unverified | 0 |
| Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption | May 19, 2025 | Knowledge Distillation, Test-time Adaptation | Code Available | 0 |
| StarFT: Robust Fine-tuning of Zero-shot Models via Spuriosity Alignment | May 19, 2025 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| Advanced Crash Causation Analysis for Freeway Safety: A Large Language Model Approach to Identifying Key Contributing Factors | May 15, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining | May 12, 2025 | Audio captioning, Audio Generation | Unverified | 0 |
| Image Classification Using a Diffusion Model as a Pre-Training Model | May 11, 2025 | Contrastive Learning, image-classification | Unverified | 0 |
| Fill the Gap: Quantifying and Reducing the Modality Gap in Image-Text Representation Learning | May 6, 2025 | Representation Learning, zero-shot-classification | Unverified | 0 |
| Helping Large Language Models Protect Themselves: An Enhanced Filtering and Summarization System | May 2, 2025 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| On the effectiveness of Large Language Models in the mechanical design domain | May 2, 2025 | Classification, Sentence | Code Available | 0 |
| Tell Me What You Know About Sexism: Expert-LLM Interaction Strategies and Co-Created Definitions for Zero-Shot Sexism Detection | Apr 21, 2025 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Radiology with Zero-Shot Multi-Task Capability | Apr 10, 2025 | Contrastive Learning, Open Vocabulary Semantic Segmentation | Unverified | 0 |
| Refining CLIP's Spatial Awareness: A Visual-Centric Perspective | Apr 3, 2025 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| CIBR: Cross-modal Information Bottleneck Regularization for Robust CLIP Generalization | Mar 31, 2025 | Contrastive Learning, image-classification | Unverified | 0 |
| ViLAaD: Enhancing "Attracting and Dispersing" Source-Free Domain Adaptation with Vision-and-Language Model | Mar 30, 2025 | Domain Adaptation, Language Modeling | Unverified | 0 |
| Enhancing Small Language Models for Cross-Lingual Generalized Zero-Shot Classification with Soft Prompt Tuning | Mar 25, 2025 | Cross-Lingual Transfer, zero-shot-classification | Unverified | 0 |
| Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection | Mar 21, 2025 | Edge Detection, Retrieval | Unverified | 0 |
| Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection | Mar 18, 2025 | Uncertainty Quantification, zero-shot-classification | Code Available | 0 |
| Real-Time Cell Sorting with Scalable In Situ FPGA-Accelerated Deep Learning | Mar 16, 2025 | Cell Detection, Classification | Code Available | 0 |
| TLAC: Two-stage LMM Augmented CLIP for Zero-Shot Classification | Mar 15, 2025 | Domain Generalization, image-classification | Code Available | 0 |
| Leveraging Vision-Language Embeddings for Zero-Shot Learning in Histopathology Images | Mar 13, 2025 | Diagnostic, image-classification | Unverified | 0 |
| OFF-CLIP: Improving Normal Detection Confidence in Radiology CLIP with Simple Off-Diagonal Term Auto-Adjustment | Mar 3, 2025 | Anomaly Localization, Classification | Code Available | 0 |
| A Zero-Shot Learning Approach for Ephemeral Gully Detection from Remote Sensing using Vision Language Models | Mar 3, 2025 | Transfer Learning, zero-shot-classification | Unverified | 0 |
| Analyzing CLIP's Performance Limitations in Multi-Object Scenarios: A Controlled High-Resolution Study | Feb 27, 2025 | Image Generation, Object | Unverified | 0 |
| SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning | Feb 27, 2025 | Diagnostic, Representation Learning | Unverified | 0 |
| Progressive Local Alignment for Medical Multimodal Pre-training | Feb 25, 2025 | Contrastive Learning, Image-text Retrieval | Unverified | 0 |
| Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs | Feb 25, 2025 | Representation Learning, zero-shot-classification | Unverified | 0 |