| Hyperbolic Image-Text Representations | Apr 18, 2023 | Image Classification | Code Available | 1 |
| Equivariant Similarity for Vision-Language Foundation Models | Mar 25, 2023 | Image-text Retrieval, Retrieval | Code Available | 1 |
| Multimodal Federated Learning via Contrastive Representation Ensemble | Feb 17, 2023 | Federated Learning, Image-text Retrieval | Code Available | 1 |
| UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling | Feb 13, 2023 | Image-text Retrieval, Retrieval | Code Available | 1 |
| LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval | Feb 6, 2023 | Image-text Retrieval, Retrieval | Code Available | 1 |
| UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers | Jan 31, 2023 | Image Captioning, Image Classification | Code Available | 1 |
| LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval | Jan 1, 2023 | Image Classification | Code Available | 1 |
| Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift | Dec 15, 2022 | Benchmarking, Image Captioning | Code Available | 1 |
| FlexiViT: One Model for All Patch Sizes | Dec 15, 2022 | All, Image-text Retrieval | Code Available | 1 |
| ComCLIP: Training-Free Compositional Image and Text Matching | Nov 25, 2022 | Image-text matching, Image-text Retrieval | Code Available | 1 |