| M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining | Jan 29, 2024 | GPU, zero-shot-classification | Code Available | 0 |
| Data-Free Generalized Zero-Shot Learning | Jan 28, 2024 | Generalized Zero-Shot Learning, zero-shot-classification | Code Available | 0 |
| A comparative study of zero-shot inference with large language models and supervised modeling in breast cancer pathology classification | Jan 25, 2024 | Transfer Learning, zero-shot-classification | Unverified | 0 |
| Zoom-shot: Fast and Efficient Unsupervised Zero-Shot Transfer of CLIP to Vision Encoders with Multimodal Loss | Jan 22, 2024 | Knowledge Distillation, zero-shot-classification | Unverified | 0 |
| Enhancing medical vision-language contrastive learning via inter-matching relation modelling | Jan 19, 2024 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0 |
| CLIP-Guided Source-Free Object Detection in Aerial Images | Jan 10, 2024 | Domain Adaptation, Object | Code Available | 1 |
| Benchmarking PathCLIP for Pathology Image Analysis | Jan 5, 2024 | Benchmarking, Decision Making | Unverified | 0 |
| Improved Zero-Shot Classification by Adapting VLMs with Text Descriptions | Jan 4, 2024 | Fine-Grained Image Classification, image-classification | Code Available | 1 |
| Building Vision-Language Models on Solid Foundations with Masked Distillation | Jan 1, 2024 | Contrastive Learning, Knowledge Distillation | Unverified | 0 |
| Open-Pose 3D Zero-Shot Learning: Benchmark and Challenges | Dec 12, 2023 | 3D Object Classification, Classification | Code Available | 1 |
| Lite-Mind: Towards Efficient and Robust Brain Representation Network | Dec 6, 2023 | Brain Decoding, Image Retrieval | Code Available | 1 |
| SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference | Dec 4, 2023 | Segmentation, Semantic Segmentation | Code Available | 1 |
| Pipeline Enabling Zero-shot Classification for Bangla Handwritten Grapheme | Dec 1, 2023 | Bangla Text Detection, Classification | Unverified | 0 |
| Explaining CLIP's performance disparities on data from blind/low vision users | Nov 29, 2023 | Few-Shot Learning, zero-shot-classification | Unverified | 0 |
| MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training | Nov 28, 2023 | Image Captioning, Transfer Learning | Unverified | 0 |
| IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers | Nov 27, 2023 | Caption Generation, Image-text Retrieval | Unverified | 0 |
| ViT-Lens: Towards Omni-modal Representations | Nov 27, 2023 | EEG, Image Generation | Code Available | 1 |
| Effective Backdoor Mitigation in Vision-Language Models Depends on the Pre-training Objective | Nov 25, 2023 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| tinyCLAP: Distilling Constrastive Language-Audio Pretrained Models | Nov 24, 2023 | Audio Generation, Event Detection | Unverified | 0 |
| Deep Learning and NLP in Cryptocurrency Forecasting: Integrating Financial, Blockchain, and Social Media Data | Nov 23, 2023 | Data Integration, Sentiment Analysis | Unverified | 0 |
| Investigating the Emergent Audio Classification Ability of ASR Foundation Models | Nov 15, 2023 | Audio Classification, Decoder | Code Available | 0 |
| Gen-Z: Generative Zero-Shot Text Classification with Contextualized Label Descriptions | Nov 13, 2023 | Classification, Language Modeling | Unverified | 0 |
| CLAMP: A Contrastive Language And Molecule Pre-training Network | Nov 12, 2023 | Graph Neural Network, zero-shot-classification | Code Available | 0 |
| Automatic Report Generation for Histopathology images using pre-trained Vision Transformers | Nov 10, 2023 | Decoder, Image Segmentation | Code Available | 0 |
| Generalized zero-shot audio-to-intent classification | Nov 4, 2023 | Classification, Goal-Oriented Dialog | Unverified | 0 |
| Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection | Nov 1, 2023 | Classification, Few-Shot Object Detection | Code Available | 1 |
| Using Large Language Models to Support Thematic Analysis in Empirical Legal Studies | Oct 28, 2023 | Language Modelling, Large Language Model | Unverified | 0 |
| ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models | Oct 27, 2023 | Column Type Annotation, Table annotation | Code Available | 1 |
| EmoCLIP: A Vision-Language Method for Zero-Shot Video Facial Expression Recognition | Oct 25, 2023 | Facial Expression Recognition, Facial Expression Recognition (FER) | Code Available | 1 |
| Linear Representations of Sentiment in Large Language Models | Oct 23, 2023 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement | Oct 21, 2023 | Depth Estimation, image-classification | Unverified | 0 |
| SILC: Improving Vision Language Pretraining with Self-Distillation | Oct 20, 2023 | Classification, Contrastive Learning | Unverified | 0 |
| MedAI Dialog Corpus (MEDIC): Zero-Shot Classification of Doctor and AI Responses in Health Consultations | Oct 19, 2023 | Classification, text-classification | Unverified | 0 |
| Evaluating the Fairness of Discriminative Foundation Models in Computer Vision | Oct 18, 2023 | Fairness, Image Captioning | Code Available | 0 |
| Estimating Uncertainty in Multimodal Foundation Models using Public Internet Data | Oct 15, 2023 | Conformal Prediction, Prediction | Code Available | 0 |
| VeCLIP: Improving CLIP Training via Visual-enriched Captions | Oct 11, 2023 | Image-text Retrieval, Retrieval | Code Available | 2 |
| Uni3D: Exploring Unified 3D Representation at Scale | Oct 10, 2023 | 3D Object Classification, Retrieval | Code Available | 2 |
| Blind Dates: Examining the Expression of Temporality in Historical Photographs | Oct 10, 2023 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift | Oct 8, 2023 | Contrastive Learning, zero-shot-classification | Unverified | 0 |
| Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks | Oct 5, 2023 | Contrastive Learning, Data Poisoning | Code Available | 0 |
| DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection | Oct 2, 2023 | Novel Object Detection, Object | Code Available | 1 |
| Telling Stories for Common Sense Zero-Shot Action Recognition | Sep 29, 2023 | Action Recognition, Articles | Code Available | 0 |
| CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic Segmentation For-Free | Sep 25, 2023 | Image Segmentation, Object Localization | Code Available | 1 |
| Exploiting CLIP-based Multi-modal Approach for Artwork Classification and Retrieval | Sep 21, 2023 | Retrieval, zero-shot-classification | Unverified | 0 |
| Auto-ACD: A Large-scale Dataset for Audio-Language Representation Learning | Sep 20, 2023 | Audio captioning, Caption Generation | Unverified | 0 |
| TAP: Targeted Prompting for Task Adaptive Generation of Textual Training Instances for Visual Classification | Sep 13, 2023 | zero-shot-classification, Zero-Shot Learning | Code Available | 1 |
| Zero-Shot Visual Classification with Guided Cropping | Sep 12, 2023 | Classification, Object | Unverified | 0 |
| Mitigating Word Bias in Zero-shot Prompt-based Classifiers | Sep 10, 2023 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| Context-Aware Prompt Tuning for Vision-Language Model with Dual-Alignment | Sep 8, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| ETP: Learning Transferable ECG Representations via ECG-Text Pre-training | Sep 6, 2023 | Diagnostic, Language Modeling | Unverified | 0 |