| RETSim: Resilient and Efficient Text Similarity | Nov 28, 2023 | Adversarial Text, Clustering | Code Available | 4 |
| FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization | Jan 17, 2025 | Anomaly Detection, Image-text matching | Code Available | 2 |
| Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP | Jun 25, 2024 | cross-modal alignment, Image Classification | Code Available | 2 |
| MeaCap: Memory-Augmented Zero-shot Image Captioning | Mar 6, 2024 | Caption Generation, Image Captioning | Code Available | 2 |
| CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification | Feb 27, 2024 | Classification, Diagnostic | Code Available | 2 |
| Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval | Mar 22, 2023 | Image-text matching, Language Modeling | Code Available | 2 |
| CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation | Mar 21, 2023 | Image Segmentation, Open Vocabulary Semantic Segmentation | Code Available | 2 |
| A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks | Jun 17, 2022 | text similarity | Code Available | 2 |
| Fine-grained Image Captioning with CLIP Reward | May 26, 2022 | Caption Generation, Descriptive | Code Available | 2 |
| GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing | May 16, 2025 | Instruction Following, Multiple-choice | Code Available | 1 |