| Zoom-shot: Fast and Efficient Unsupervised Zero-Shot Transfer of CLIP to Vision Encoders with Multimodal Loss | Jan 22, 2024 | Knowledge Distillation, zero-shot-classification | —Unverified | 0 |
| MixNorm: Test-Time Adaptation Through Online Normalization Estimation | Oct 21, 2021 | Domain Adaptation, Test-time Adaptation | —Unverified | 0 |
| MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training | Nov 28, 2023 | Image Captioning, Transfer Learning | —Unverified | 0 |
| Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use | Mar 5, 2024 | image-classification, Image Classification | —Unverified | 0 |
| Multi-label Zero-Shot Audio Classification with Temporal Attention | Aug 31, 2024 | Audio Classification, Classification | —Unverified | 0 |
| Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge | Jul 30, 2020 | Classification, General Classification | —Unverified | 0 |
| Multi-Modal Generative Embedding Model | May 29, 2024 | Caption Generation, Cross-Modal Retrieval | —Unverified | 0 |
| Multi-modal Relation Distillation for Unified 3D Representation Learning | Jul 19, 2024 | Relation, Representation Learning | —Unverified | 0 |
| Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio | Dec 23, 2024 | Contrastive Learning, Prompt Learning | —Unverified | 0 |
| NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics | Nov 11, 2024 | zero-shot-classification, Zero-Shot Learning | —Unverified | 0 |