| Cross-Modal Retrieval Meets Inference: Improving Zero-Shot Classification with Cross-Modal Retrieval | Aug 29, 2023 | Cross-Modal Retrieval, image-classification | Unverified | 0 |
| Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment | Aug 24, 2023 | Self-Learning, zero-shot-classification | Code Available | 1 |
| Adversarial Illusions in Multi-Modal Embeddings | Aug 22, 2023 | Image Generation, Text Generation | Code Available | 1 |
| Image-free Classifier Injection for Zero-Shot Classification | Aug 21, 2023 | Classification, Decoder | Code Available | 1 |
| DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability | Aug 18, 2023 | Image Generation, zero-shot-classification | Unverified | 0 |
| Robustifying Point Cloud Networks by Refocusing | Aug 10, 2023 | 3D Classification, Adversarial Defense | Code Available | 0 |
| ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation | Aug 4, 2023 | Domain Adaptation, image-classification | Code Available | 1 |
| PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts | Aug 2, 2023 | Classification, image-classification | Code Available | 1 |
| Developing and Evaluating Tiny to Medium-Sized Turkish BERT Models | Jul 26, 2023 | Classification, Computational Efficiency | Unverified | 0 |
| PRIOR: Prototype Representation Joint Learning from Medical Images and Reports | Jul 24, 2023 | Contrastive Learning, Image to text | Code Available | 1 |
| MineralImage5k: A benchmark for zero-shot raw mineral visual recognition and description | Jul 20, 2023 | zero-shot-classification, Zero-Shot Learning | Code Available | 1 |
| Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP | Jul 18, 2023 | Attribute, Image-text Retrieval | Unverified | 0 |
| RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing | Jun 20, 2023 | Cross-Modal Retrieval, Image Retrieval | Code Available | 2 |
| RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Jun 19, 2023 | Classification, Cross-Modal Retrieval | Code Available | 2 |
| Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation | Jun 14, 2023 | Attribute, Knowledge Graphs | Unverified | 0 |
| Improving Zero-Shot Detection of Low Prevalence Chest Pathologies using Domain Pre-trained Language Models | Jun 13, 2023 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| Analysis of the Fed's communication by using textual entailment model of Zero-Shot classification | Jun 7, 2023 | Natural Language Inference, Sentiment Analysis | Unverified | 0 |
| UCAS-IIE-NLP at SemEval-2023 Task 12: Enhancing Generalization of Multilingual BERT for Low-resource Sentiment Analysis | Jun 1, 2023 | Contrastive Learning, Representation Learning | Code Available | 1 |
| Multi-level Cross-modal Feature Alignment via Contrastive Learning towards Zero-shot Classification of Remote Sensing Image Scenes | May 31, 2023 | Classification, Contrastive Learning | Code Available | 0 |
| Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning | May 31, 2023 | Decision Making, General Knowledge | Code Available | 2 |
| Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models | May 29, 2023 | Image Captioning, Image Classification | Code Available | 1 |
| Improved Probabilistic Image-Text Representations | May 29, 2023 | Data Augmentation, Image-text matching | Code Available | 1 |
| Adapting Language-Audio Models as Few-Shot Audio Learners | May 28, 2023 | Audio Classification, Classification | Unverified | 0 |
| DiffCLIP: Leveraging Stable Diffusion for Language Grounded 3D Classification | May 25, 2023 | 3D Classification, Classification | Unverified | 0 |
| OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning | May 24, 2023 | Data Augmentation, Fact Checking | Code Available | 0 |
| S-CLIP: Semi-supervised Vision-Language Learning using Few Specialist Captions | May 23, 2023 | Contrastive Learning, Image-text Retrieval | Code Available | 1 |
| Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science | May 23, 2023 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| Parts of Speech-Grounded Subspaces in Vision-Language Models | May 23, 2023 | Image Generation, POS | Code Available | 1 |
| LLM-Pruner: On the Structural Pruning of Large Language Models | May 19, 2023 | Text Generation, zero-shot-classification | Code Available | 3 |
| MedBLIP: Bootstrapping Language-Image Pre-training from 3D Medical Images and Texts | May 18, 2023 | Medical Visual Question Answering, Question Answering | Code Available | 1 |
| ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | May 14, 2023 | 3D Classification, 3D Point Cloud Classification | Code Available | 2 |
| Boosting Visual-Language Models by Exploiting Hard Samples | May 9, 2023 | Retrieval, zero-shot-classification | Code Available | 0 |
| The Benefits of Label-Description Training for Zero-Shot Text Classification | May 3, 2023 | Classification, domain classification | Code Available | 0 |
| Unsupervised Improvement of Audio-Text Cross-Modal Representations | May 3, 2023 | Acoustic Scene Classification, Classification | Code Available | 0 |
| The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks | Apr 26, 2023 | Data Augmentation, Language Modelling | Code Available | 1 |
| CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval | Apr 21, 2023 | Data Augmentation, Information Retrieval | Code Available | 0 |
| WYTIWYR: A User Intent-Aware Framework with Multi-modal Inputs for Visualization Retrieval | Apr 14, 2023 | Retrieval, zero-shot-classification | Code Available | 0 |
| SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval) | Apr 13, 2023 | Classification, Sentiment Analysis | Code Available | 1 |
| What does CLIP know about a red circle? Visual prompt engineering for VLMs | Apr 13, 2023 | Image Generation, Prompt Engineering | Unverified | 0 |
| RECLIP: Resource-efficient CLIP by Training with Small Images | Apr 12, 2023 | Contrastive Learning, Image-text Retrieval | Unverified | 0 |
| Exploring Vision-Language Models for Imbalanced Learning | Apr 4, 2023 | Decoder, zero-shot-classification | Code Available | 1 |
| SoftCLIP: Softer Cross-modal Alignment Makes CLIP Stronger | Mar 30, 2023 | cross-modal alignment, zero-shot-classification | Unverified | 0 |
| Your Diffusion Model is Secretly a Zero-Shot Classifier | Mar 28, 2023 | Domain Generalization, Fine-Grained Image Classification | Code Available | 2 |
| Evaluation of ChatGPT for NLP-based Mental Health Applications | Mar 28, 2023 | Classification, Depression Detection | Unverified | 0 |
| Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection | Mar 25, 2023 | Decoder, object-detection | Unverified | 0 |
| Frozen Language Model Helps ECG Zero-Shot Learning | Mar 22, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification | Mar 13, 2023 | Job classification, Prompt Engineering | Unverified | 0 |
| Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks | Mar 13, 2023 | Backdoor Attack, Data Poisoning | Code Available | 1 |
| Exploiting the Textual Potential from Vision-Language Pre-training for Text-based Person Search | Mar 8, 2023 | Attribute, Person Search | Unverified | 0 |
| Describe me an Aucklet: Generating Grounded Perceptual Category Descriptions | Mar 7, 2023 | nlg evaluation, Representation Learning | Code Available | 0 |