| PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining | Apr 29, 2022 | Image Classification, Language Modeling | Unverified | 0 |
| Robust Cross-Modal Representation Learning with Progressive Self-Distillation | Apr 10, 2022 | Contrastive Learning, Image Captioning | Unverified | 0 |
| An Analysis of Semantically-Aligned Speech-Text Embeddings | Apr 4, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Geodesic Multi-Modal Mixup for Robust Fine-Tuning | Mar 8, 2022 | Image Captioning, zero-shot-classification | Code Available | 0 |
| Universal Prototype Transport for Zero-Shot Action Recognition and Localization | Mar 8, 2022 | Action Recognition, Object | Unverified | 0 |
| Zero-Shot and Few-Shot Classification of Biomedical Articles in Context of the COVID-19 Pandemic | Jan 9, 2022 | Articles, Multi-Task Learning | Unverified | 0 |
| A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision | Dec 27, 2021 | Classification, Image Captioning | Unverified | 0 |
| Learning Aligned Cross-Modal Representation for Generalized Zero-Shot Classification | Dec 24, 2021 | Classification, zero-shot-classification | Unverified | 0 |
| 3D Compositional Zero-shot Learning with DeCompositional Consensus | Nov 29, 2021 | Benchmarking, Compositional Zero-Shot Learning | Unverified | 0 |
| Make an Omelette with Breaking Eggs: Zero-Shot Learning for Novel Attribute Synthesis | Nov 28, 2021 | Attribute, Classification | Unverified | 0 |