| CCMB: A Large-scale Chinese Cross-modal Benchmark | May 8, 2022 | image-classification, Image Classification | Code Available | 1 | 5 |
| Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark | Feb 14, 2022 | Benchmarking, Contrastive Learning | Code Available | 0 | 5 |
| Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models | Jan 23, 2025 | Image Retrieval, Retrieval | Code Available | 0 | 5 |
| Visual Representation Learning with Self-Supervised Attention for Low-Label High-data Regime | Jan 22, 2022 | Few-Shot Image Classification, image-classification | Code Available | 0 | 5 |
| M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining | Jan 29, 2024 | GPU, zero-shot-classification | Code Available | 0 | 5 |
| Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers | Jan 31, 2021 | Image Retrieval, Retrieval | Code Available | 0 | 5 |
| ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training | Sep 30, 2022 | Computational Efficiency, Contrastive Learning | Code Available | 0 | 5 |
| Learning with Succinct Common Representation Based on Wyner's Common Information | May 27, 2019 | Density Ratio Estimation, Image Retrieval | Unverified | 0 | 0 |
| Zero-Shot Hashing via Transferring Supervised Knowledge | Jun 16, 2016 | Image Retrieval, Retrieval | Unverified | 0 | 0 |
| Piecewise-Linear Manifolds for Deep Metric Learning | Mar 22, 2024 | Image Retrieval, Metric Learning | Unverified | 0 | 0 |