| Anatomy-Aware Conditional Image-Text Retrieval | Mar 10, 2025 | Anatomy, Contrastive Learning | —Unverified | 0 |
| Variance-Aware Loss Scheduling for Multimodal Alignment in Low-Data Settings | Mar 5, 2025 | Contrastive Learning, Image-text Retrieval | —Unverified | 0 |
| LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning | Mar 4, 2025 | Contrastive Learning, Image-text Retrieval | —Unverified | 0 |
| MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations | Mar 2, 2025 | Image Classification | —Unverified | 0 |
| Progressive Local Alignment for Medical Multimodal Pre-training | Feb 25, 2025 | Contrastive Learning, Image-text Retrieval | —Unverified | 0 |
| SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features | Feb 20, 2025 | Fairness, Image-text Retrieval | —Unverified | 0 |
| Fine-tuning Multimodal Transformers on Edge: A Parallel Split Learning Approach | Feb 10, 2025 | Federated Learning, Image-text Retrieval | —Unverified | 0 |
| DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions | Feb 7, 2025 | Anomaly Detection, Image-text Retrieval | —Unverified | 0 |
| MASS: Overcoming Language Bias in Image-Text Matching | Jan 20, 2025 | Image-text matching, Image-text Retrieval | —Unverified | 0 |
| TSVC: Tripartite Learning with Semantic Variation Consistency for Robust Image-Text Retrieval | Jan 19, 2025 | Cross-Modal Retrieval, Image-text Retrieval | —Unverified | 0 |