| Delving into the Openness of CLIP | Jun 4, 2022 | image-classificationImage Classification | CodeCode Available | 0 | 5 |
| RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models | Apr 21, 2023 | Cross-Modal RetrievalImage-text matching | CodeCode Available | 0 | 5 |
| PromptHash:Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval | Jan 1, 2025 | Contrastive LearningImage Retrieval | CodeCode Available | 0 | 5 |
| Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models | Feb 18, 2025 | Image to textOptical Character Recognition | CodeCode Available | 0 | 5 |
| BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval | Jun 14, 2024 | Image RetrievalImage to text | CodeCode Available | 0 | 5 |
| PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval | Mar 20, 2025 | Contrastive LearningCross-Modal Retrieval | CodeCode Available | 0 | 5 |
| Real-world validation of a multimodal LLM-powered pipeline for High-Accuracy Clinical Trial Patient Matching leveraging EHR data | Mar 19, 2025 | Image to text | CodeCode Available | 0 | 5 |
| Pragmatic Radiology Report Generation | Nov 28, 2023 | Image to text | CodeCode Available | 0 | 5 |
| A Data-Driven Guided Decoding Mechanism for Diagnostic Captioning | Jun 20, 2024 | DiagnosticImage to text | CodeCode Available | 0 | 5 |
| MultiQG-TI: Towards Question Generation from Multi-modal Sources | Jul 7, 2023 | Image to textOptical Character Recognition | CodeCode Available | 0 | 5 |