| Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | Nov 2, 2022 | Contrastive Learning, image-classification | Code Available | 5 |
| General Image Descriptors for Open World Image Retrieval using ViT CLIP | Oct 20, 2022 | Image Retrieval, Retrieval | Code Available | 1 |
| ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training | Sep 30, 2022 | Computational Efficiency, Contrastive Learning | Code Available | 0 |
| FETA: Towards Specializing Foundation Models for Expert Task Applications | Sep 8, 2022 | Domain Generalization, Few-Shot Learning | Code Available | 1 |
| Curriculum Learning for Data-Efficient Vision-Language Alignment | Jul 29, 2022 | Contrastive Learning, Image Retrieval | —Unverified | 0 |
| Cross-lingual and Multilingual CLIP | Jun 1, 2022 | Contrastive Learning, Image-text Retrieval | Code Available | 2 |
| CCMB: A Large-scale Chinese Cross-modal Benchmark | May 8, 2022 | image-classification, Image Classification | Code Available | 1 |
| Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark | Feb 14, 2022 | Benchmarking, Contrastive Learning | Code Available | 0 |
| Visual Representation Learning with Self-Supervised Attention for Low-Label High-data Regime | Jan 22, 2022 | Few-Shot Image Classification, image-classification | Code Available | 0 |
| FLAVA: A Foundational Language And Vision Alignment Model | Dec 8, 2021 | Image Retrieval, Image-to-Text Retrieval | Code Available | 1 |